DockYard / elixir-mail

Build composable mail messages

Charset handling? #78

Open johnnyshields opened 6 years ago

johnnyshields commented 6 years ago

It looks like this library handles transfer encodings such as base64, but not the charset of the decoded mail, for example ASCII, SHIFT-JIS, ISO-2022-JP, etc. (Please correct me if I'm wrong.)

Ruby's Mail gem handles this nicely; refer to [the pick_encoding method](https://github.com/mikel/mail/blob/master/lib/mail/version_specific/ruby_1_9.rb#L186).
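To make the distinction concrete, here is a minimal Ruby sketch that uses only core methods (no mail library involved). The sample text and variable names are illustrative assumptions. It shows that undoing base64 yields only raw bytes, and that a second step is needed to interpret those bytes in the declared charset and transcode them to UTF-8.

```ruby
# Illustrative only: simulate a Shift_JIS text part sent with base64 transfer encoding.
original   = "\u3053\u3093\u306B\u3061\u306F"  # "konnichiwa" in hiragana
sjis_bytes = original.encode('Shift_JIS')      # bytes as the sender's charset stores them
wire       = [sjis_bytes].pack('m0')           # base64, as on the wire

# Step 1: undo the transfer encoding. The result is binary data with no charset attached.
raw = wire.unpack1('m')

# Step 2: label the bytes with the declared charset, then transcode to UTF-8.
utf8 = raw.dup.force_encoding('Shift_JIS').encode('UTF-8')

puts utf8 == original
```

Step 1 is what a transfer-encoding decoder covers; step 2 is the charset handling this issue asks about.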

bcardarella commented 6 years ago

There are multiple encoders: https://github.com/DockYard/elixir-mail/tree/master/lib/mail/encoders

I'm open to adding additional encoders or, even better, opening the API to allow for custom encoders.

johnnyshields commented 5 years ago

FYI, we ended up using ErlPort to call this Ruby script as a way to ensure all mails are converted to UTF-8:

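require 'mail'
require 'json'

# Parses a raw message and returns a JSON object that maps each MIME type
# to the concatenated UTF-8 text of the parts with that type.
def parse_body(body)
  message = Mail.new body
  build_mime_map({}, message).to_json
end

# Appends this message's body under its MIME type, then recurses into its parts.
def build_mime_map(mime_map, message)
  type = extract_type(message)
  body = extract_body(message)

  mime_map[type] = (mime_map[type] || '') + body
  message.parts.reduce(mime_map, &method(:build_mime_map))
end

# Builds the "type/subtype" key, falling back to multipart/alternative.
def extract_type(message)
  main_type = message.main_type || 'multipart'
  sub_type  = message.sub_type || 'alternative'
  "#{main_type}/#{sub_type}"
end

# Labels the decoded body with the declared charset (UTF-8 if none is given)
# and transcodes it to UTF-8, replacing invalid or undefined characters.
def extract_body(message)
  charset = message.charset
  encoding = charset ? Mail::RubyVer.pick_encoding(charset) : 'UTF-8'
  body = message.body.decoded.force_encoding(encoding)
  body.encode('UTF-8', invalid: :replace, undef: :replace)
end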
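
One caveat with the transcoding step: Ruby documents `String#encode` as a no-op when the source and target encodings are the same, so with the `'UTF-8'` fallback any invalid bytes would pass through unreplaced. The sketch below is a stdlib-only variant under stated assumptions: `to_utf8` is a hypothetical helper, not part of the Mail gem, and it uses `Encoding.find` as a simplified stand-in for `pick_encoding`, so non-canonical labels simply fall back to UTF-8. A final `scrub` covers the same-encoding case.

```ruby
# Hypothetical helper (not from the Mail gem): convert raw bytes to valid UTF-8.
def to_utf8(raw, charset)
  encoding = begin
    Encoding.find(charset || 'UTF-8')
  rescue ArgumentError
    Encoding::UTF_8 # assumption: treat unknown charset labels as UTF-8
  end

  labeled = raw.dup.force_encoding(encoding)

  # encode handles real transcoding; scrub replaces invalid bytes that survive
  # when the source is already UTF-8 and encode does nothing.
  labeled.encode('UTF-8', invalid: :replace, undef: :replace).scrub
end

latin1  = to_utf8("caf\xE9".b, 'ISO-8859-1') # transcoded from Latin-1
no_hint = to_utf8("abc\xFF".b, nil)          # no charset declared, stray byte scrubbed

puts latin1
puts no_hint.valid_encoding?
```

Returning valid UTF-8 in every case matters here because the result is later serialized with `to_json`.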