zedshaw / lamson

Pythonic SMTP Application Server
http://lamsonproject.org/

lamson.encoding._parse_charset_header breaks encoded multiline headers #15

Open wRAR opened 11 years ago

wRAR commented 11 years ago

For headers that are folded per RFC 2822 §2.2.3 and MIME-encoded per RFC 2047, lamson.encoding._parse_charset_header concatenates the encoded-text parts of all encoded-words and passes the result to lamson.encoding.apply_charset_to_header as a single string. This is wrong: each encoded-word has to be decoded on its own. If a base64-encoded encoded-word is padded (i.e. ends with =), decoding the concatenated string stops at that padding and everything after it is lost, so in many cases the parsed header value is shorter than it should be. I'm also worried about support for headers that mix B-encoded and Q-encoded encoded-words with unencoded text, which is allowed per RFC 2047 §5(1).
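To illustrate the point about padding, here is a minimal sketch of decoding each encoded-word separately instead of joining the payloads first. It is not lamson's code: the names `ENCODED_WORD`, `decode_encoded_word` and `decode_words_separately` are made up for this example, it uses only the Python 3 standard library, and the regular expression is a simplification of the RFC 2047 grammar (it does not handle RFC 2231 language suffixes, for instance).

```python
import base64
import quopri
import re

# Simplified pattern for one RFC 2047 encoded-word:
#   =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_encoded_word(charset, encoding, text):
    """Decode the payload of a single encoded-word to a str."""
    if encoding.lower() == "b":
        # Each B payload carries its own padding, so decode it alone.
        raw = base64.b64decode(text)
    else:
        # header=True makes "_" decode to a space, as the Q encoding requires.
        raw = quopri.decodestring(text.encode("ascii"), header=True)
    return raw.decode(charset)


def decode_words_separately(header):
    """Return the decoded text of every encoded-word, one list item per word."""
    return [decode_encoded_word(*m.groups())
            for m in ENCODED_WORD.finditer(header)]


# A folded header whose first encoded-word ends with "=" padding.
folded = "=?utf-8?B?aGVsbG8=?=\r\n =?utf-8?B?d29ybGQ=?="
print(decode_words_separately(folded))
```

Because every payload is passed to the decoder on its own, the padding at the end of the first word cannot cut off the second one.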

wRAR commented 11 years ago

A simple test shows that if a header consists of a B-encoded word and a Q-encoded word separated by a space, both words are decoded properly, but the space is copied into the result too, whereas per the RFC it must be ignored. It looks like this code deserves a full rewrite, this time following RFC 2047 §6.
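For reference, the whitespace rule could be sketched roughly as follows. This is only an illustration under assumptions, not a proposed patch: `decode_header_value` is a hypothetical name, the encoded-word pattern is simplified, and the code targets the Python 3 standard library rather than lamson's own helpers.

```python
import base64
import quopri
import re

# Simplified pattern for one RFC 2047 encoded-word.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_header_value(header):
    """Decode a header value that mixes encoded-words and plain text."""
    # Unfold: drop line breaks that are followed by folding whitespace.
    header = re.sub(r"\r?\n(?=[ \t])", "", header)
    out = []
    pos = 0
    prev_was_encoded = False
    for m in ENCODED_WORD.finditer(header):
        gap = header[pos:m.start()]
        # Whitespace that only separates two encoded-words is ignored;
        # any other text between matches is kept as it is.
        if not (prev_was_encoded and gap.strip(" \t") == ""):
            out.append(gap)
        charset, encoding, text = m.groups()
        if encoding.lower() == "b":
            raw = base64.b64decode(text)
        else:
            raw = quopri.decodestring(text.encode("ascii"), header=True)
        out.append(raw.decode(charset))
        pos = m.end()
        prev_was_encoded = True
    # Keep whatever plain text follows the last encoded-word.
    out.append(header[pos:])
    return "".join(out)


print(decode_header_value("=?utf-8?B?aGVsbG8=?= =?utf-8?Q?w=C3=B6rld?="))
print(decode_header_value("Re: =?utf-8?B?aGVsbG8=?= there"))
```

The space between the B-encoded and Q-encoded words is dropped in the first call, while the spaces around the encoded-word in the second call survive because they border unencoded text.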