Closed pali closed 8 years ago
Thank you!
I found a difference in the way Perl 5.20.3 and 5.24.1 encode a test string that's used in an email From: header. It's not clear why this breaks some downstream processing of the email (which may be a separate bug), but I'm curious whether both outputs are technically correct:
# This code outputs two different things on 5.20.3 and 5.24.1
use Encode qw(encode);
my $perl_str = '¥Test User <testuser@example.com>';
my $encoded_str = encode("MIME-Header", $perl_str);
print $encoded_str;
# 5.24.1 output: =?UTF-8?B?w4LCpVRlc3QgVXNlciA8dGVzdHVzZXJAZXhhbXBsZS5jb20+?=
# 5.20.3 output: =?UTF-8?B?w4LCpVRlc3QgVXNlciA=?=<testuser@example.com>
I was reading through the RFC (http://www.faqs.org/rfcs/rfc2047.html) and it seems like the definition of "encoded-word" may not be adhered to in this rewrite of the module. Is this the correct place to post comments on this, or should I file a GitHub issue? I don't interact with Perl much, so I'm not sure what the best way to flag an issue is.
@amit777 The output =?UTF-8?B?w4LCpVRlc3QgVXNlciA=?=<testuser@example.com> is incorrect because RFC 2047 requires a space between ...A=?= and <testu...>. The output from Perl 5.24.1 is technically correct per RFC 2047, but it is not suitable for a From header. Because the From header is structured and has its own grammar, a generic encoder like MIME-Header cannot be used for it. If you look at the updated documentation for the Encode::MIME::Header module, you will see that the module is meant for unstructured email headers or the RFC 822 'text' token. Note that it is not possible to write a "generic" module that works for any structured email header, because the module itself does not know which structure the whole header should be encoded according to.
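To illustrate the whitespace rule from RFC 2047 discussed here, the following is a minimal sketch of a check that every encoded-word in a string is delimited by whitespace or by the start or end of the string. The helper name encoded_words_are_separated and the simplified encoded-word pattern are assumptions made for illustration; this is not part of Encode and not a full RFC 2047 parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode): returns 1 when every RFC 2047
# encoded-word in $header is surrounded by whitespace or the string
# boundaries, and 0 otherwise.
sub encoded_words_are_separated {
    my ($header) = @_;

    # Simplified shape of an encoded-word: =?charset?encoding?encoded-text?=
    my $encoded_word = qr/=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=/;

    # Capture at most one non-whitespace character directly before and
    # directly after each encoded-word; either one means a violation.
    while ($header =~ /(\S?)$encoded_word(\S?)/g) {
        return 0 if length($1) || length($2);
    }
    return 1;
}

# The two outputs reported in this thread:
my $old = '=?UTF-8?B?w4LCpVRlc3QgVXNlciA=?=<testuser@example.com>';
my $new = '=?UTF-8?B?w4LCpVRlc3QgVXNlciA8dGVzdHVzZXJAZXhhbXBsZS5jb20+?=';

print "5.20.3 output separated: ", encoded_words_are_separated($old), "\n";
print "5.24.1 output separated: ", encoded_words_are_separated($new), "\n";
```

The check only looks at the characters adjacent to each encoded-word; it says nothing about whether the header is valid for a structured field such as From.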
So if you want to MIME-encode a From header (or To/Cc/Bcc), you first need to split it into RFC 822 'text' tokens, then MIME-encode each text token that RFC 2047 allows to be encoded, and finally combine the output into one string.
use utf8;
use Encode qw(encode);
my $name = '¥Test User';
my $address = 'testuser@example.com';
my $encoded = encode('MIME-Header', $name) . " <$address>";
The token representing the address cannot be MIME-encoded, so you should at least check that it contains only ASCII characters. Better still, validate the email address, but be careful! The grammar for email addresses is strange; see RFC 2822 for it.
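The approach described above (encode only the display name, and at least check that the address is ASCII) could be sketched as follows. The helper name format_mailbox and the printable-ASCII regular expression are assumptions made for illustration; the check is deliberately crude and is not RFC 2822 address validation.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);

# Hypothetical helper (not part of Encode): builds a "name <address>"
# string, MIME-encoding only the display name.
sub format_mailbox {
    my ($name, $address) = @_;

    # The address token cannot be MIME-encoded, so reject anything that is
    # not printable ASCII without whitespace. This is only a sanity check,
    # NOT full RFC 2822 validation.
    die "address must be printable ASCII without whitespace\n"
        unless $address =~ /\A[\x21-\x7E]+\z/;

    # Encode the display name on its own, then append the raw address,
    # separated from the encoded-word by a space.
    return encode('MIME-Header', $name) . " <$address>";
}

print format_mailbox('¥Test User', 'testuser@example.com'), "\n";
```

A dedicated address module is a more robust choice than this check whenever real-world addresses have to be handled.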
Btw, if you think there should be a module that encodes the From header correctly, look at my patches for Email::MIME (https://github.com/rjbs/Email-MIME/pull/35) and my Email::Address::XS module (https://github.com/pali/Email-Address-XS). Note that Email::Address is broken.
I hope this helps you understand the whole problem with From/To/Cc/... headers and how Encode::MIME::Header was terribly broken prior to Encode 2.83 (https://metacpan.org/pod/Encode::MIME::Header#BUGS). Emails generated by old Perl versions were simply broken and could not be parsed by RFC 2047 compliant parsers. That is basically not acceptable; it is better to break compatibility and fix these problems than to stay with a nonsensical, broken encoder.
Thank you! That was incredibly informative and helpful. I'll update my code with your suggestions.
If you need to construct strings for From/To/Cc/... headers, look at my Email::Address::XS module (https://github.com/pali/Email-Address-XS). It is not on CPAN yet, but I would like to hear some feedback about it (whether it is really useful for users!). It provides everything needed and should be fast and RFC 2822 correct. MIME-encoding of the /phrase/ needs to be done manually, but you can pass it via encode:
my $address = Email::Address::XS->new(phrase => encode('MIME-Header', $phrase), address => $address);
my $value = $address->format();
I would love to try it; however, it's difficult to get the module onto our servers if it's not on CPAN. I will most likely switch to it based on the notes, though.
I was able to install your module in my perlbrew environment. I will test it and report back whether it works for me. Thank you!
It looks like Email::Address::XS works well for me. I don't have extensive test cases around it, but it seems to work as a drop-in replacement for Email::Address. Would love to see this on CPAN! Thanks again.
I made some final fixes to Email::Address::XS and now it is on CPAN: https://metacpan.org/pod/Email::Address::XS
This patch series cleans up and refactors the Encode::MIME::Header module.
New features:
Changes:
Please recheck whether you agree with the changes in the POD documentation. If something needs to be extended or fixed, let me know.