SpamScope / mail-parser

Tokenizer for raw mails
https://pypi.python.org/pypi/mail-parser
Apache License 2.0
367 stars 87 forks source link

Encoded Header Parsing is Different to Email Clients #83

Closed bakrill closed 3 years ago

bakrill commented 3 years ago

Describe the bug Encoded headers have spaces stripped depending on how they are encoded.

To Reproduce Steps to reproduce the behavior:

  1. import mailparser
  2. mail = mailparser.parse_from_file(f)
  3. print(mail.subject)

Expected behavior In email clients, the characters are displayed with a space between them. E.g.

image

Raw mail https://gist.github.com/bakrill/0f0237e506beb4d507a6a1a63b4d5e5c

Environment:

Additional context I think it originates from https://github.com/SpamScope/mail-parser/blob/develop/mailparser/utils.py#L137. Decoded header parts have surrounding whitespace stripped and are then concatenated.

fedelemantuano commented 3 years ago

Fixed. It will be in next release.