Closed akuchling closed 13 years ago
The attached test program shows how parsing an e-mail message with the email package, then converting the resulting message to a string, fails to round-trip properly. Instead it breaks the encoding of the subject line.
The root of the problem: the subject is RFC 2047 encoded, long enough to require line wrapping, and it contains one of the splitchars used by Header.encode() -- meaning a semicolon or comma. In my example, this is:
Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;_Important_Legislative_Efforts?=
Parsing the message turns that into a string S. generator.Generator._write_headers() then outputs Header(S).encode(), so it keeps treating the value as an ASCII string, and therefore breaks the header at the semicolon, resulting in:
Subject: =?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;\<NEWLINE>\<SPACE>_Important_Legislative_Efforts?=
Newline and space aren't legal in Q encoding, so MUAs give up and display all the =?utf-8?Q? stuff.
The attached patch is a possible fix; it uses the decode_header() and make_header() functions to figure out the encoding properly; it fixes my example, at least. But does it increase the odds of crashing on messages with malformed headers? Should it go into 2.7 given that we're at the RC stage? What about 2.6?
(BTW, Barry, I noticed this because messages being sent through Mailman were coming out with broken subject lines. The system generating the messages is slightly weird -- doing the UTF-8 quoting is unnecessary since the subject contains no special characters -- but I think Mailman shouldn't be breaking subject lines. I haven't verified that this Python fix actually fixes Mailman, but I think this is a Python bug, not a Mailman bug.)
Minor fix to the patch: the import of Header could actually be removed, since the class is no longer referenced at all with this change.
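For readers following along, the decode_header()/make_header() approach mentioned above can be sketched roughly as follows. This is not the attached patch, which is not reproduced here; it is a minimal Python 3 illustration, and `reencode_header` is a hypothetical helper name introduced only for this example. It decodes the raw value into (bytes, charset) pairs first, so the wrapping logic knows the text is UTF-8 and emits complete encoded words rather than splitting inside one.

```python
from email.header import decode_header, make_header

# Subject value from the report, as it appears in the raw message.
raw_subject = (
    "=?utf-8?Q?2010_Foundation_Salary_and_Benefits_Report;"
    "_Important_Legislative_Efforts?="
)


def reencode_header(raw_value):
    """Hypothetical helper: decode RFC 2047 words, then rebuild a Header.

    decode_header() yields (bytes, charset) pairs; make_header() turns
    them back into a Header object that remembers the charset.
    """
    return make_header(decode_header(raw_value))


header = reencode_header(raw_subject)

# Human-readable text of the subject.
decoded_text = str(header)

# RFC 2047 form, wrapped by Header.encode() with knowledge of the charset.
folded = header.encode()

print(decoded_text)
for line in folded.splitlines():
    print(repr(line))
```

Whether this is safe on malformed headers is exactly the open question raised above; the sketch only covers the well-formed case.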
This is fixed in 3.2/3.3 by the fix for bpo-11492. The suggested fix for 2.7 is more radical than I'm comfortable with for a point release. I'm open to argument on that, but in the meantime I'm closing the issue with 11492 as the superseder.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = 'https://github.com/bitdancer'
closed_at =
created_at =
labels = ['type-bug', 'library']
title = 'Straightforward usage of email package fails to round-trip'
updated_at =
user = 'https://github.com/akuchling'
```
bugs.python.org fields:
```python
activity =
actor = 'r.david.murray'
assignee = 'r.david.murray'
closed = True
closed_date =
closer = 'r.david.murray'
components = ['Library (Lib)']
creation =
creator = 'akuchling'
dependencies = []
files = ['17410', '17411']
hgrepos = []
issue_num = 8769
keywords = ['patch']
message_count = 4.0
messages = ['106100', '106101', '106102', '133972']
nosy_count = 3.0
nosy_names = ['barry', 'akuchling', 'r.david.murray']
pr_nums = []
priority = 'normal'
resolution = 'duplicate'
stage = 'resolved'
status = 'closed'
superseder = '11492'
type = 'behavior'
url = 'https://bugs.python.org/issue8769'
versions = ['Python 3.1', 'Python 2.7', 'Python 3.2', 'Python 3.3']
```