python / cpython


email.Message.get_params decodes only first one header value #73864

Closed 5026f48e-ea5d-4b39-bae8-51ce19b208c6 closed 7 years ago

5026f48e-ea5d-4b39-bae8-51ce19b208c6 commented 7 years ago
BPO 29678
Nosy @warsaw, @bitdancer, @andrewnester
PRs
  • python/cpython#394
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:
    ```python
    assignee = None
    closed_at =
    created_at =
    labels = ['type-feature', 'invalid', 'expert-email']
    title = 'email.Message.get_params decodes only first one header value'
    updated_at =
    user = 'https://bugs.python.org/pi314159'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date =
    closer = 'r.david.murray'
    components = ['email']
    creation =
    creator = 'pi314159'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29678
    keywords = []
    message_count = 3.0
    messages = ['288720', '288810', '288814']
    nosy_count = 4.0
    nosy_names = ['barry', 'r.david.murray', 'andrewnester', 'pi314159']
    pr_nums = ['394']
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue29678'
    versions = ['Python 3.6']
    ```

    5026f48e-ea5d-4b39-bae8-51ce19b208c6 commented 7 years ago

    The email.Message class has a get_params() method that can decode (unquote) header values in compliance with RFC 2231 and RFC 2047. But if an email message contains multiple headers with the same key, the method cannot be used to decode any header other than the first. In my application I could use:

    ```python
    headers = message.items()
    for key, value in headers:
        cleanValue = message.get_params(value=value)
        print(key, cleanValue)
    ```

    I have also posted a question on Stack Overflow: http://stackoverflow.com/questions/42502312/python-3-email-package-how-decode-given-header-value

    85518756-5bea-4efa-9ce9-5daf552d31b1 commented 7 years ago

    Thanks for reporting! I just added a PR fixing this.

    bitdancer commented 7 years ago

    Thanks for the response, but I do not believe that this is a bug.

    The Python 3 email package will decode the headers automatically if you use the new policies, so if you iterate through the headers, you'll get the decoded versions, with access to the params dict for each. (For custom headers you will have to register the appropriate header parser, but it should automatically handle all the standard MIME headers.)
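As a rough illustration of the point above, here is a minimal sketch that parses a message with `email.policy.default` and iterates over its headers. The sample message text and the `decoded` variable are made up for this example; only the stdlib calls come from the email package.

```python
from email import message_from_string
from email.policy import default

# Hypothetical sample message, invented for illustration only.
raw = (
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Disposition: attachment; filename*=utf-8''na%C3%AFve.txt\n"
    "Subject: =?utf-8?q?Caf=C3=A9?=\n"
    "\n"
    "body\n"
)

# Passing a new-style policy makes items() return header objects
# (str subclasses) instead of raw strings.
msg = message_from_string(raw, policy=default)

decoded = {}
for name, value in msg.items():
    # Only parameterized MIME headers expose .params; other headers
    # (such as Subject) fall back to an empty dict here.
    params = dict(getattr(value, "params", {}))
    decoded[name] = (str(value), params)

for name, (text, params) in decoded.items():
    print(name, "->", text, params)
```

The `getattr` fallback is a convenience for this sketch, so that unstructured headers and parameterized headers can be handled in the same loop.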

    For the compat32 policy (which is the default), there is indeed no easy way to do the same, but that isn't a bug, because any header that contains parameters (MIME headers) is supposed to be unique.
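For the unique-header case under compat32, a sketch of the usual RFC 2231 handling might look like the following. It assumes a single Content-Disposition header; the sample message and the `filename` variable are invented for illustration.

```python
from email import message_from_string
from email.utils import collapse_rfc2231_value

# Hypothetical sample message with one RFC 2231 encoded parameter.
raw = (
    "Content-Disposition: attachment; filename*=utf-8''na%C3%AFve.txt\n"
    "\n"
    "body\n"
)

# No policy argument, so the compat32 default applies.
msg = message_from_string(raw)

# For an RFC 2231 encoded parameter, get_param() returns a
# (charset, language, value) tuple rather than a plain string.
raw_param = msg.get_param("filename", header="content-disposition")

# collapse_rfc2231_value() turns that tuple (or a plain string)
# into a single decoded str.
filename = collapse_rfc2231_value(raw_param)
print(filename)
```

Note that `get_param()` looks up the header by name, so it only ever sees the first matching header, which is the limitation discussed in this issue.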

    As for the Stack Overflow question, see above for the RFC 2231 issue (the headers are unique). For RFC 2047 decoding, decode_header does that. With the compat32 policy it is awkward, but documented, how to apply the functions in the header submodule to decode an arbitrary string. For the new policies you don't have to think about it; as I said above, the decoding is done automatically (all unknown headers are treated as unstructured, and RFC 2047 decoding is done automatically).
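A minimal sketch of the compat32 route, assuming the goal is to decode every header value, including repeated ones. The helper name `decode_rfc2047` and the sample `X-Note` headers are hypothetical; `decode_header` and `make_header` are the functions from the `email.header` submodule.

```python
from email import message_from_string
from email.header import decode_header, make_header


def decode_rfc2047(raw_value):
    """Decode RFC 2047 encoded words in an arbitrary header string.

    Hypothetical helper for this sketch, not part of the email package.
    """
    return str(make_header(decode_header(raw_value)))


# Hypothetical sample message with a repeated custom header.
raw = (
    "X-Note: =?utf-8?q?Caf=C3=A9?=\n"
    "X-Note: =?iso-8859-1?q?p=F6stal?=\n"
    "\n"
    "body\n"
)

# No policy argument, so items() yields raw, still-encoded strings.
msg = message_from_string(raw)

# items() returns every header, so repeated keys are all decoded.
decoded = [(name, decode_rfc2047(value)) for name, value in msg.items()]
for name, value in decoded:
    print(name, value)
```

This covers only RFC 2047 encoded words; it does not parse or decode RFC 2231 parameters.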