mailgun / flanker

Python email address and Mime parsing library
http://www.mailgun.com
Apache License 2.0

Flanker 0.9.0: Python 2 compatibility issue due to overriding `.NL` values in Python email package #198

Closed jfly closed 6 years ago

jfly commented 6 years ago

We're using Python 2 and Flanker 0.9.0:

$ pip freeze | grep -i flanker
flanker==0.9.0
$ python
Python 2.7.12 (default, Apr  9 2018, 15:21:15) 
[GCC 7.3.1 20180312] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import email.utils; from email.header import Header
>>> print repr(str(email.header.Header(email.utils.formataddr((u'a '*50, u'example@example.com')), 'To')))
'a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a\n a a a a a a a a a a a a <example@example.com>'
>>> from flanker.addresslib import address
>>> print repr(str(email.header.Header(email.utils.formataddr((u'a '*50, u'example@example.com')), 'To')))
'a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a\r\n a a a a a a a a a a a a <example@example.com>'
>>> 

Note that before importing address, the folded header contains a \n, and after importing address, it contains a \r\n instead.

I don't actually know much about the email specs, but this seems like a backwards-compatibility issue, even if it's addressing a bug in Python 2.7.12.

Just to confirm, this doesn't happen with Flanker 0.8.5:

$ pip freeze | grep -i flanker
flanker==0.8.5
$ python
Python 2.7.12 (default, Apr  9 2018, 15:21:15) 
[GCC 7.3.1 20180312] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import email.utils; from email.header import Header
>>> print repr(str(email.header.Header(email.utils.formataddr((u'a '*50, u'example@example.com')), 'To')))
'a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a\n a a a a a a a a a a a a <example@example.com>'
>>> from flanker.addresslib import address
INFO:flanker.addresslib._parser.parser:building mailbox parser
INFO:flanker.addresslib._parser.parser:building addr_spec parser
WARNING:flanker.addresslib._parser.parser:Symbol 'name_addr' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'mailbox_or_url_list' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'angle_addr' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'mailbox_or_url' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'delim' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'mailbox' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'url' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'phrase' is unreachable
INFO:flanker.addresslib._parser.parser:building url parser
WARNING:flanker.addresslib._parser.parser:Symbol 'domain' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'name_addr' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'mailbox_or_url_list' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'angle_addr' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'mailbox_or_url' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'local_part' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'delim' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'domain_literal_text' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'mailbox' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'quoted_string_text' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'addr_spec' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'phrase' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'quoted_string' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'domain_literal' is unreachable
INFO:flanker.addresslib._parser.parser:building mailbox_or_url parser
WARNING:flanker.addresslib._parser.parser:Symbol 'mailbox_or_url_list' is unreachable
WARNING:flanker.addresslib._parser.parser:Symbol 'delim' is unreachable
INFO:flanker.addresslib._parser.parser:building mailbox_or_url_list parser
>>> print repr(str(email.header.Header(email.utils.formataddr((u'a '*50, u'example@example.com')), 'To')))
'a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a\n a a a a a a a a a a a a <example@example.com>'

jfly commented 6 years ago

For the record, I dug into this, and it appears to be a side effect of adding support for Python 3. This code: https://github.com/mailgun/flanker/blob/v0.9.0/flanker/_email.py#L29-L33 clobbers the constants named NL in Python 2's email package, changing them from \n to \r\n. Because those constants are module-level, the change affects every caller of the email package in the process, not just Flanker.
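
To illustrate the mechanism, here is a minimal sketch. It is not Flanker's code and does not touch the real email package: `fake_email_header` and `fold` are hypothetical stand-ins for a library module that reads a module-level `NL` constant each time it folds a line.

```python
import types

# Hypothetical stand-in for a library module that folds long lines using a
# module-level constant (analogous to NL in Python 2's email package).
fake_email_header = types.ModuleType("fake_email_header")
fake_email_header.NL = "\n"


def _fold(words, width=20):
    """Join words into lines of at most `width` characters, separated by NL."""
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if len(candidate) > width and current:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    # NL is looked up at call time, so a later reassignment changes the result.
    return (fake_email_header.NL + " ").join(lines)


fake_email_header.fold = _fold

words = ["a"] * 15
before = fake_email_header.fold(words)

# What an unrelated import could do as a side effect: overwrite the constant.
fake_email_header.NL = "\r\n"
after = fake_email_header.fold(words)

print(repr(before))
print(repr(after))
```

The caller's code is identical in both calls; only the shared constant changed, which is why the difference shows up after an import that looks unrelated.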

horkhe commented 6 years ago

According to RFC 2822:

Messages are divided into lines of characters. A line is a series of characters that is delimited with the two characters carriage-return and line-feed; that is, the carriage return (CR) character (ASCII value 13) followed immediately by the line feed (LF) character (ASCII value 10). (The carriage-return/line-feed pair is usually written in this document as "CRLF".)

We are consciously trying to ensure that CRLF is used consistently, and this is mentioned in the change log. The change should not cause any real-world issues, except that some tests comparing output verbatim may fail.

So I am closing this as "by design".
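
For tests that compare folded headers character for character, one possible workaround is to normalize line endings before comparing. This is only a sketch: `normalize_newlines` is a hypothetical helper, not part of Flanker or the standard library, and the sample strings are shortened versions of the output shown in the report above.

```python
import re


def normalize_newlines(text, newline="\n"):
    """Convert CRLF, lone CR, and lone LF to a single newline convention."""
    return re.sub(r"\r\n|\r|\n", newline, text)


# Shortened examples of a header folded with LF and with CRLF.
old_output = "a a a\n a <example@example.com>"
new_output = "a a a\r\n a <example@example.com>"

# The raw strings differ, but they match once line endings are normalized.
print(old_output == new_output)
print(normalize_newlines(old_output) == normalize_newlines(new_output))
```

Comparing normalized strings keeps such tests independent of which line ending the email package happens to emit.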