python / cpython

The Python programming language
https://www.python.org

RFC 2231 support for email package #36505

Closed 23aeb1a4-d162-4473-a32d-9adcac86cf2c closed 22 years ago

23aeb1a4-d162-4473-a32d-9adcac86cf2c commented 22 years ago
BPO 549133
Nosy @loewis, @warsaw, @phdru
Files
  • email-patch.zip: The zip file contains the patch and new file msg_26.txt (put it into src/Lib/test/data)
  • email-patch.zip
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:
    ```python
    assignee = 'https://github.com/warsaw'
    closed_at =
    created_at =
    labels = ['extension-modules']
    title = 'RFC 2231 support for email package'
    updated_at =
    user = 'https://github.com/phdru'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'barry'
    assignee = 'barry'
    closed = True
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation =
    creator = 'phd'
    dependencies = []
    files = ['4194', '4195']
    hgrepos = []
    issue_num = 549133
    keywords = ['patch']
    message_count = 10.0
    messages = ['39708', '39709', '39710', '39711', '39712', '39713', '39714', '39715', '39716', '39717']
    nosy_count = 3.0
    nosy_names = ['loewis', 'barry', 'phd']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue549133'
    versions = ['Python 2.3']
    ```

    23aeb1a4-d162-4473-a32d-9adcac86cf2c commented 22 years ago

    RFC 2231 defines the methods for encoding and decoding parameters in mail headers.

    This patch adds support for parameter decoding. It changes the interface of Message._get_params_preserve().
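For readers unfamiliar with the format, here is a minimal sketch of RFC 2231 parameter decoding as exposed by the email package in current Python 3. It is not necessarily how the patch in this issue implemented it, and the header value and filename are made up for illustration.

```python
import email
from email.utils import collapse_rfc2231_value

# Hypothetical message: the filename parameter uses the RFC 2231
# extended form  charset'language'percent-encoded-value.
raw = (
    "Content-Type: text/plain\n"
    "Content-Disposition: attachment; filename*=utf-8'en'na%C3%AFve%20file.txt\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

# For an RFC 2231 encoded parameter, get_param() returns a 3-tuple
# (charset, language, value) rather than a plain string.
param = msg.get_param("filename", header="content-disposition")
charset, language, _ = param

# collapse_rfc2231_value() turns that tuple into a single str,
# decoding the value with the declared charset.
filename = collapse_rfc2231_value(param)
print(charset, language, filename)
```

Callers that only need the final string can pass whatever `get_param()` returns straight to `collapse_rfc2231_value()`, since it also accepts plain (non-tuple) values.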

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 22 years ago

    Logged In: YES user_id=21627

    Did you test this code with non-ASCII messages?

    I discourage the use of the default encoding. Instead, if an encoding is present, either a Unicode object or the information about the original encoding should be returned. If absolutely necessary, conversion to the default encoding is acceptable, provided that any UnicodeError raised by that conversion is caught.

    I'm not sure how to deal with UnicodeErrors when constructing the Unicode object: you should probably raise an exception, but have that exception carry the data that caused the problem, so that the caller has the opportunity to process it by other means.
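The suggestion of an exception that carries the offending data could look roughly like the following. This is a sketch in Python 3, not code from the patch; `ParamDecodeError` and `decode_param` are hypothetical names introduced only for illustration.

```python
class ParamDecodeError(ValueError):
    """Hypothetical exception that keeps the undecodable data for the caller."""

    def __init__(self, charset, raw):
        super().__init__(f"cannot decode parameter as {charset!r}")
        self.charset = charset  # charset declared in the header
        self.raw = raw          # original bytes, untouched


def decode_param(charset, raw):
    """Decode raw bytes with the declared charset, or raise ParamDecodeError."""
    try:
        return raw.decode(charset)
    except (UnicodeError, LookupError) as exc:
        # UnicodeError: bytes are invalid for the charset.
        # LookupError: the charset name is unknown to Python.
        raise ParamDecodeError(charset, raw) from exc


try:
    decode_param("utf-8", b"\xff\xfe")
except ParamDecodeError as err:
    # The caller still has the raw bytes and can try another strategy.
    recovered = err.raw.decode("latin-1")
```

Keeping the raw bytes on the exception lets the caller retry with a different charset or error handler instead of losing the parameter entirely.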

    23aeb1a4-d162-4473-a32d-9adcac86cf2c commented 22 years ago

    Logged In: YES user_id=4799

    > Did you test this code with non-ASCII messages?

    I did.

    > I discourage the use of the default encoding.

    What is the "default encoding" in this context?

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 22 years ago

    Logged In: YES user_id=21627

    The default encoding is the one returned by sys.getdefaultencoding(). If this returns, on your system, say, 'koi-8r', then testing the patch with koi-8r is equivalent to testing it with ASCII only in a standard installation.

    In your patch, the line

      value = unicode(value[2], value[0]).encode()

    makes use of the default encoding in the .encode call; this call should always have an argument - it will fail if value[0] differs from the default encoding, and characters from the set difference between the encodings are used in value[2].
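The hazard of an argument-less `.encode()` can be illustrated in Python 3, with the caveat that the behaviour differs from the Python 2 code under review: in Python 3 the implicit codec is always UTF-8, whereas in Python 2 it was `sys.getdefaultencoding()` (ASCII in a standard installation) and the call would typically raise instead. The sample text is made up.

```python
import sys

# Bytes as they might arrive in a KOI8-R encoded parameter (made-up sample).
raw = "Привет".encode("koi8-r")
text = raw.decode("koi8-r")

# No argument: the result depends on an implicit codec (UTF-8 in Python 3),
# which is unrelated to the charset the message declared.
implicit = text.encode()

# Explicit argument: the result is the same on every installation.
explicit = text.encode("koi8-r")

print(sys.getdefaultencoding())
print(implicit == raw, explicit == raw)
```

Naming the codec at every call site makes the behaviour independent of how a given interpreter is configured.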

    23aeb1a4-d162-4473-a32d-9adcac86cf2c commented 22 years ago

    Logged In: YES user_id=4799

    > I discourage the use of the default encoding. Instead, if an encoding is present, a Unicode object, or the information about the original encoding should be returned.

    This particular function (_formatparam) must return an ASCII string, not a Unicode object. The resulting string is put into a header.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 22 years ago

    Logged In: YES user_id=21627

    If it really *has* to be ASCII, please be explicit about this, invoking .encode('ascii'). I still wonder whether this could raise a UnicodeError, though.

    Another comment: 'languge' is spelled incorrectly in a few places.
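One way to be explicit about ASCII and still cope with a possible UnicodeError is sketched below in Python 3. This is an assumption about how such handling might look, not the code from the patch; `format_param_value` is a hypothetical helper, while `email.utils.encode_rfc2231` is part of the standard library.

```python
from email.utils import encode_rfc2231


def format_param_value(text, charset="utf-8"):
    """Return an ASCII-only str for a header parameter (hypothetical helper)."""
    try:
        # Explicit codec: fails loudly on any non-ASCII character.
        return text.encode("ascii").decode("ascii")
    except UnicodeEncodeError:
        # Fall back to the RFC 2231 extended form, which is pure ASCII:
        # charset'language'percent-encoded-value
        return encode_rfc2231(text, charset=charset)


plain = format_param_value("report.txt")
extended = format_param_value("naïve")
print(plain, extended)
```

A value that comes back in the extended form has to be emitted under the starred parameter name (for example `filename*=`) so that receivers know to decode it.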

    23aeb1a4-d162-4473-a32d-9adcac86cf2c commented 22 years ago

    Logged In: YES user_id=4799

    > .encode('ascii')

    Agree.

    > languge

    Fixed.

    23aeb1a4-d162-4473-a32d-9adcac86cf2c commented 22 years ago

    Logged In: YES user_id=4799

    New patch uploaded.

    warsaw commented 22 years ago

    Logged In: YES user_id=12800

    Thanks Oleg! Sorry for the delay. I've accepted this patch and backported it to Python 2.1 (which the email package must still support). Will commit it to Python 2.3 cvs momentarily.