The message contains a sequence of bytes in the From: line that is malformed if treated as UTF-8. (The message is base64-encoded to preserve these bytes; I found the original example in a piece of spam email.)
I can retrieve it as an ASCII-8BIT string (e.g. by setting the input stream's encoding). However, when I pass it to Mail.new, it fails: Mail::Message defaults to UTF-8, and parsing attempts to convert the value, which doesn't work because of the malformed bytes.
This looks like a bug to me. I would have expected it to use the string's encoding as the default charset.
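To illustrate the underlying behaviour without involving the mail gem at all, here is a minimal plain-Ruby sketch. The header text and address are made up for the example (they are not the bytes from the spam message), and it only shows how Ruby itself treats such a string, not what Mail does internally.

```ruby
# Hypothetical sample header: 0xE9 is "é" in ISO-8859-1, but a lone 0xE9
# followed by a space is not a valid UTF-8 sequence.
raw = "From: Jos\xE9 <jose@example.com>\r\n".b # String#b returns an ASCII-8BIT copy

# As binary, any byte sequence is considered valid.
binary_ok = raw.valid_encoding?

# Relabelling the same bytes as UTF-8 exposes the malformed sequence.
utf8_ok = raw.dup.force_encoding(Encoding::UTF_8).valid_encoding?

# Converting (rather than relabelling) from binary to UTF-8 raises,
# because the high byte has no defined mapping.
convert_error =
  begin
    raw.encode(Encoding::UTF_8)
    nil
  rescue EncodingError => e
    e.class
  end

puts "valid as binary: #{binary_ok}"
puts "valid as UTF-8:  #{utf8_ok}"
puts "encode raised:   #{convert_error.inspect}"
```

This is why the string's own encoding matters: the bytes are fine as long as nothing tries to reinterpret or convert them as UTF-8.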
Adding the following seems to fix this:
diff --git a/lib/mail/message.rb b/lib/mail/message.rb
index 5c7d40ab..1d9a1e2f 100644
--- a/lib/mail/message.rb
+++ b/lib/mail/message.rb
@@ -2117,6 +2117,7 @@ module Mail
     end
 
     def init_with_string(string)
+      @charset = string.encoding.to_s if @defaulted_charset
       self.raw_source = string
       set_envelope_header
       parse_message
(I can also work around this with Mail::Message.default_charset = 'ASCII-8BIT', but it still seems like something that could be improved.)
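The idea behind the one-line patch can be modelled in isolation. The sketch below uses a hypothetical stand-in class, TinyMessage, which is not part of the mail gem; it only mimics the @charset / @defaulted_charset pair from the diff and assumes 'UTF-8' as the default, as described above.

```ruby
# Hypothetical stand-in for Mail::Message; only the charset logic is modelled.
class TinyMessage
  attr_reader :charset, :raw_source

  def initialize(string)
    @charset = 'UTF-8'          # assumed library-wide default
    @defaulted_charset = true   # no charset has been set explicitly yet
    init_with_string(string)
  end

  # An explicit assignment means the charset is no longer "defaulted".
  def charset=(value)
    @defaulted_charset = false
    @charset = value
  end

  private

  def init_with_string(string)
    # The proposed change: prefer the input string's own encoding
    # while the charset is still the untouched default.
    @charset = string.encoding.to_s if @defaulted_charset
    @raw_source = string
  end
end

binary_msg = TinyMessage.new("From: Jos\xE9\r\n".b) # binary input
utf8_msg   = TinyMessage.new("From: Jose\r\n")      # ordinary UTF-8 literal

puts binary_msg.charset
puts utf8_msg.charset
```

With this approach a binary string keeps a binary charset, while ordinary UTF-8 input behaves as before, so the change should only affect callers who deliberately pass in a non-UTF-8 string.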