Open 8a5fd93c-2f61-42bc-83cc-c28c8e7cd129 opened 2 years ago
In various places in the email library, `str.splitlines` is used to split up a message where folding might take place in the original message source. This appears to be a bug because when these split parts are re-joined, they are joined by a CRLF.
https://github.com/python/cpython/blob/ef5bb25e2d6147cd44be9c9b166525fb30485be0/Lib/email/header.py#L369
`str.splitlines` splits on "universal newlines", which can include newlines other than CRLF.
https://docs.python.org/3/library/stdtypes.html#str.splitlines
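As a quick illustration of the difference, the sketch below compares `str.splitlines` with a plain split on CRLF for a value containing the FS character (`\x1c`). The variable names are only illustrative and are not taken from the email library.

```python
# A header value containing the FS control character (0x1C) and no CRLF.
value = "BEFORE\x1cAFTER"

# str.splitlines treats FS as a line boundary.
universal = value.splitlines()

# Splitting on CRLF only leaves the value in one piece.
crlf_only = value.split("\r\n")

# Other non-CR/LF characters documented as line boundaries for str.splitlines.
candidates = "\x0b\x0c\x1c\x1d\x1e\x85\u2028\u2029"
extra_boundaries = [c for c in candidates if len(f"a{c}b".splitlines()) == 2]

print(universal)   # ['BEFORE', 'AFTER']
print(crlf_only)   # ['BEFORE\x1cAFTER']
print(len(extra_boundaries))
```

Every character in `candidates` is treated as a line boundary by `str.splitlines`, so any of them in a header value would trigger the same behavior as FS.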
However, the email RFCs define folding whitespace with CRLF as the only possible newline type (optionally surrounded by WSP (SP/HTAB) and/or comments). https://datatracker.ietf.org/doc/html/rfc5322#section-3.2.2
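For comparison, a minimal sketch of unfolding that recognizes only CRLF followed by WSP as a fold might look like the following. The `unfold` helper and the `_FOLD` pattern are hypothetical and are not part of the email library; the sketch also ignores comments and obsolete folding syntax.

```python
import re

# Hypothetical pattern: a CRLF counts as a fold only when SP or HTAB follows it.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(raw_header: str) -> str:
    """Remove fold CRLFs, leaving every other character untouched."""
    return _FOLD.sub("", raw_header)


folded = "Header-With-FS-Char: BEFORE\x1cAFTER\r\n continued"
unfolded = unfold(folded)
print(repr(unfolded))
```

With this approach the FS character stays inside the header value, because it is never treated as a line boundary.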
The end result is that a message making a roundtrip through the email parser/generator is mangled because any non-CRLF "universal newlines" it contains are converted to CRLFs. Anything in the header after the non-CRLF "universal newline" appears on its own line with no preceding whitespace. This appears to happen with all of the stock policies.
```python
from email import message_from_bytes
from email.policy import SMTPUTF8
eml_bytes = b'Header-With-FS-Char: BEFORE\x1cAFTER\r\n\r\nBody\r\n'
print(eml_bytes)
message = message_from_bytes(eml_bytes, policy=SMTPUTF8)
print(message.as_bytes(policy=SMTPUTF8))
```
```
b'Header-With-FS-Char: BEFORE\x1cAFTER\r\n\r\nBody\r\n'
b'Header-With-FS-Char: BEFORE\r\nAFTER\r\n\r\nBody\r\n'
```
The operational impact of this mangling is that the "AFTER" text now makes the message format invalid because it is neither a valid header (no ": ") nor the valid start of a message body (only one CRLF). Common MIME-viewers (e.g. Thunderbird/Outlook) appear to interpret it as a body anyway and any subsequent headers become part of the body.
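To see how the mangled form is read back, the sketch below feeds the mangled bytes from the report to the parser again, using the default policy. It starts from the mangled literal rather than reproducing the roundtrip, so it does not depend on whether the folding behavior itself is still present in a given Python version.

```python
from email import message_from_bytes
from email.errors import MissingHeaderBodySeparatorDefect

# The mangled output reported above, used directly as parser input.
mangled = b'Header-With-FS-Char: BEFORE\r\nAFTER\r\n\r\nBody\r\n'
reparsed = message_from_bytes(mangled)

# The header value is cut off at the point where the FS character was.
header_value = reparsed['Header-With-FS-Char']

# "AFTER" is neither a header nor a continuation line, so header parsing
# stops there and the line is pushed into the body.
payload = reparsed.get_payload()

# The parser records that the blank header/body separator line was missing.
has_separator_defect = any(
    isinstance(d, MissingHeaderBodySeparatorDefect) for d in reparsed.defects
)

print(repr(header_value))
print(repr(payload))
print(has_separator_defect)
```

In other words, Python's own parser treats the mangled message much like the mail clients mentioned above do: the header is truncated and the remainder lands in the body.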
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['type-bug', 'library', '3.11']
title = 'Email Header Folding Converts Non-CRLF Newlines to CRLFs'
updated_at =
user = 'https://bugs.python.org/jwalterclark'
```
bugs.python.org fields:
```python
activity =
actor = 'jwalterclark'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'jwalterclark'
dependencies = []
files = []
hgrepos = []
issue_num = 46462
keywords = []
message_count = 1.0
messages = ['411171']
nosy_count = 1.0
nosy_names = ['jwalterclark']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue46462'
versions = ['Python 3.11']
```