python / cpython

email.message.Message.set_payload and as_string given charset 'us-ascii' plus 8bit data produces invalid message #51553

Open bitdancer opened 14 years ago

bitdancer commented 14 years ago
BPO 7304
Nosy @warsaw, @bitdancer, @mitya57

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

```python
assignee = None
closed_at = None
created_at =
labels = ['type-bug', 'expert-email']
title = "email.message.Message.set_payload and as_string given charset 'us-ascii' plus 8bit data produces invalid message"
updated_at =
user = 'https://github.com/bitdancer'
```

bugs.python.org fields:

```python
activity =
actor = 'mitya57'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['email']
creation =
creator = 'r.david.murray'
dependencies = []
files = []
hgrepos = []
issue_num = 7304
keywords = []
message_count = 2.0
messages = ['95133', '156629']
nosy_count = 3.0
nosy_names = ['barry', 'r.david.murray', 'mitya57']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue7304'
versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']
```

The following produces a non-conformant message, since the us-ascii charset is strictly 7bit:

>>> import email.message
>>> m = email.message.Message()
>>> m.set_payload("""A few lines
... of 8-bit text
...
... One high bit character: ².
... """, 'us-ascii')
>>> print m.as_string()
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8bit

A few lines
of 8-bit text

One high bit character: ².

>>>
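
Since the stage is 'test needed', here is a rough sketch in Python 3 of how the non-conformance could be detected. It is not taken from the email package or its test suite: the helper name `declares_ascii_but_has_8bit` is hypothetical, and the raw bytes assume the high-bit character was latin-1 encoded (0xB2), which the report does not state.

```python
import email


def declares_ascii_but_has_8bit(msg):
    """Return True if msg declares us-ascii (or no charset) but carries non-ASCII bytes."""
    charset = msg.get_content_charset()  # lowercased charset name, or None
    if charset not in (None, "us-ascii"):
        return False
    raw = msg.get_payload(decode=True)  # body as bytes; None for multipart
    if raw is None:
        return False
    try:
        raw.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False


# Rebuild the reported output from raw bytes.
# Assumption: the superscript two was encoded as latin-1 (byte 0xB2).
raw_message = (
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="us-ascii"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"A few lines\n"
    b"of 8-bit text\n"
    b"\n"
    b"One high bit character: \xb2.\n"
)
bad = email.message_from_bytes(raw_message)
print(declares_ascii_but_has_8bit(bad))
```

The check looks only at the decoded body bytes, so it flags the message whether the transfer encoding says 7bit or 8bit.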

bitdancer commented 12 years ago

In Python2 the fix would be to use charset unknown-8bit instead of us-ascii.

In Python3 this actually puts unicode in the message body. There we should default to utf-8, but this requires a more extensive change than the Python2 change, and probably should not be backported.
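
Until something like that lands, a caller can choose the charset itself. The following is only a caller-side sketch of the "default to utf-8" idea in Python 3, not the proposed change to the email package. The helper name `set_text_payload` is hypothetical.

```python
import email.message


def set_text_payload(msg, text):
    """Set text as the payload, using us-ascii only when the text really is ASCII."""
    try:
        text.encode("ascii")
        charset = "us-ascii"
    except UnicodeEncodeError:
        # Fall back to utf-8 rather than mislabelling 8-bit data as us-ascii.
        charset = "utf-8"
    msg.set_payload(text, charset)
    return charset


m = email.message.Message()
chosen = set_text_payload(m, "One high bit character: \u00b2.\n")
print(chosen)
print(m["Content-Type"])
```

With the utf-8 charset the email package picks a transfer encoding for the body itself, so the caller does not need to set Content-Transfer-Encoding by hand.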

Once this is fixed in Python3 the utf-8 default check can be removed from MIMEText (bpo-14380).