python / cpython

The Python programming language
https://www.python.org
Other
62.86k stars 30.11k forks source link

imaplib: incorrect quoting in commands #40038

Open 08a33928-f154-4664-9dd9-f081c581f20d opened 20 years ago

08a33928-f154-4664-9dd9-f081c581f20d commented 20 years ago
BPO 917120
Nosy @warsaw, @bitdancer, @soltysh, @xtsimpouris
Files
  • adjust-mustquote-re-in-imaplib.patch: exactly what dmbaggett described
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['easy', 'type-bug', 'library', 'expert-email'] title = 'imaplib: incorrect quoting in commands' updated_at = user = 'https://bugs.python.org/anadelonbrin' ``` bugs.python.org fields: ```python activity = actor = 'xtsimpouris' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)', 'email'] creation = creator = 'anadelonbrin' dependencies = [] files = ['18367'] hgrepos = [] issue_num = 917120 keywords = ['patch', 'easy'] message_count = 11.0 messages = ['20244', '20245', '87655', '87656', '112741', '112976', '114330', '181950', '181957', '181977', '408860'] nosy_count = 10.0 nosy_names = ['barry', 'anadelonbrin', 'dmbaggett', 'r.david.murray', 'meatballhat', 'maciej.szulik', 'Mauro.Cicognini', 'dveeden', 'bjshan', 'xtsimpouris'] pr_nums = [] priority = 'low' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue917120' versions = ['Python 3.3'] ```

    08a33928-f154-4664-9dd9-f081c581f20d commented 20 years ago

    imaplib incorrectly chooses to quote some arguments.

    In particular, doing "UID FETCH # BODY.PEEK[]" results in the BODY.PEEK[] being quoted, and it should not (according to the RFC), which means the command fails. This is demonstrated below. It's possible (and likely) that other UID FETCH arguments are incorrectly quoted.

    This occurs with anon cvs python of 16/3/04, and 2.3.3. Windows XP SP1.

    I'm happy to provide more info if required, just let me know. I could try and work up a patch, but it would be better from someone really familiar with imaplib so that I don't screw up legitimate quoting.

    >>> import imaplib
    >>> i = imaplib.IMAP4("server")
    >>> i.login("username", "password")
    ('OK', ['LOGIN Ok.'])
    >>> i.select()
    ('OK', ['38'])
    >>> i.debug = 4
    >>> i.uid("FETCH", "96", "BODY")
      29:14.23 > GKGP7 UID FETCH 96 BODY
      29:14.40 < * 31 FETCH (UID 96 BODY (("text" "plain"
    ("charset" "iso-8859-1") NIL NIL "quoted-printable" 32
    0)("text" "html" ("charset" "iso-8859-1") NIL NIL
    "quoted-printable" 368 10) "alternative"))
      29:14.40 < GKGP7 OK FETCH completed.
    ('OK', ['31 (UID 96 BODY (("text" "plain" ("charset"
    "iso-8859-1") NIL NIL "quoted-printable" 32 0)("text"
    "html" ("charset" "iso-8859-1") NIL NIL
    "quoted-printable" 368 10) "alternative"))'])
    >>> i.uid("FETCH", "96", "BODY.PEEK[]")
      29:17.04 > GKGP8 UID FETCH 96 "BODY.PEEK[]"
      29:17.21 < GKGP8 NO Error in IMAP command received by
    server.
      29:17.21 NO response: Error in IMAP command received
    by server.
    ('NO', ['Error in IMAP command received by server.'])
    >>> i.logout()
      29:31.26 > GKGP9 LOGOUT
      29:31.42 < * BYE Courier-IMAP server shutting down
      29:31.42 BYE response: Courier-IMAP server shutting down
      29:31.42 < GKGP9 OK LOGOUT completed
    ('BYE', ['Courier-IMAP server shutting down'])
    >>>
    08a33928-f154-4664-9dd9-f081c581f20d commented 20 years ago

    Logged In: YES user_id=552329

    Sorry, I missed the bit in the docs that points out that stuff is always quoted and that using () avoids it. Still, it does seem that imaplib would be doing it's job better if it followed correct quoting, rather than always quoting. It would certainly be easier to use for people familiar with IMAP, but unfamiliar with imaplib.

    5923ffca-9b18-4f3f-9d7f-4960900ce166 commented 15 years ago

    I'm not sure this causes the behavior reported here, but I believe there really is a bug in imaplib.

    In particular, it seems wrong to me that this line:

    mustquote = re.compile(r"[^\w!#$%&'*+,.:;<=>?^`|~-]")

    has \w in it. Should that be \s?

    I found this when I noticed that SELECT commands on mailboxes with spaces in their names failed.

    5923ffca-9b18-4f3f-9d7f-4960900ce166 commented 15 years ago

    OK, I missed the initial caret in the regex. The mustquote regex is listing everything that needn't be quoted, and then negating. I still think it's wrong, though. According to BNF given in the Formal Syntax section of RFC 3501, you must must quote atom-specials, which are defined thus:

    atom-specials = "(" / ")" / "{" / SP / CTL / list-wildcards / quoted-specials / resp-specials list-wildcards = "%" / "*" quoted-specials = DQUOTE / "\" resp-specials = "]"

    So I think this regex should do it:

    mustquote = re.compile(r'[()\s%*"]|"{"|"\\"|"\]"')

    Changing status to bug.

    81bc2b8c-b983-4a8c-9f1b-948d88b498e7 commented 14 years ago

    I'm attaching a patch which does exactly what dmbaggett recommended w.r.t. the mustquote regex. All current tests pass, but I'm not sure if the current tests even cover this code (how is coverage measured in the stdlib tests?)

    On a related note, the _checkquote method which uses the mustquote regex is dead code in Python 3.2+, AFAICT.

    5923ffca-9b18-4f3f-9d7f-4960900ce166 commented 14 years ago

    Piers Lauder, author of imaplib, emailed me the following comment about this bug: ------------ The regex for "mustquote_cre" looks bizarre, and I regret to say I can no longer remember its genesis.

    Note however, that the term CTL in the RFC definition for "atom-specials" means "any ASCII control character and DEL, 0x00 - 0x1f, 0x7f", and so maybe defining what is NOT an atom-special was considered easier.

    The suggested replacement regex may not match these...? -------------

    It seems like we need to enumerate the control characters in the regex to be absolutely correct here.

    83d2e70e-e599-4a04-b820-3814bbdb9bef commented 14 years ago

    I agree with Dan's comment in msg112741 about dead code so this only applies to 2.7. I'll raise a new issue to get the dead code removed from py3k.

    cdb055e7-c88d-4176-bafa-417ca6390e87 commented 11 years ago

    The removal of the dead code causes imaplib under py3k to lose the quoting functionality that is described in documentation (except for passwords, that do get always quoted as stated).

    I submit that we give at least a temporary warning in the docs, and that the _quote() function is exposed to let the programmer do their own quoting when they feel it necessary.

    bitdancer commented 11 years ago

    I don't understand what you mean by removing dead code leading to loss of functionality, unless you mean that the removal of the call to the quoting code in Python3 led to a loss of functionality relative to Python2, in which case I agree. It also led to the behavior being out of sync with the documentation.

    Reviewing the history of the changes (see bpo-1210), it appears to me that the removal of the call to _checkquote was in fact unintentional and is a bug (it existed in the port-to-python3 patch submitted by Victor, but was commented out, which probably means he had it commented out for testing and did not notice he had not restored it). This makes the later removal of _checkquote incorrect as well.

    This error is primarily a consequence of imaplib having very few tests.

    This module needs a lot of love in Python3 from someone, and it is not an easy topic to wrap ones head around. It's on my list of things to look at, but there are a bunch of things ahead of it.

    For the immediate issue, it is working as documented in Python2.7, so there is nothing to do there. For Python3 we haven't had the quoting since the start, so we have the opportunity to consider changing the quoting rules if we wish...and we may have have no choice, since the new behavior has been in released versions for several Python3 versions now and starting to quote like Python2.7 did might break otherwise working code. I don't have an opinion on how to fix this this yet, since while I know more about the IMAP protocol than I did a year ago, I still don't know enough to even write the tests....

    cdb055e7-c88d-4176-bafa-417ca6390e87 commented 11 years ago

    David, that is exactly what I meant: functionality for Python 3 is less than the functionality available for Python 2, and behavior is completely out of sync with the documentation.

    Bug or not, and independent of the root cause (I don't know if anyone will ever come up with more or better tests), I agree that this behavior for Python 3 has been around for a long time. I don't think it will break any significant code, but that's just my personal opinion; the main point is that re-introducing quoting functionality in imaplib would be a significant effort, also because reading the other relevant threads it is apparent that correct implementation is not there in Python 2 either.

    I think that we have an easy way: 1) fix the documentation for Python 3 so that it reflects behavior and 2) expose the _quote() private method with a new static function that the user will be able to call when they deem it necessary. They get to come up with a good regex or other ways to detect strings that need quoting, but still it's better than relying (like I did) on the advertised functionality and having to dig into the bowels of the module to understand the source of some bizarre bugs...

    7d693bb6-b491-4924-9076-07729e800947 commented 2 years ago

    As far as I can see I have the same problem from a different perspective. Selecting folders that include a space result in broken use of API, as server responds with "folder is not dound" (which is valid as part of the folder name is never sent). To circumvent the problem I suggest using a patched version of _command, where if name is "select" or "examine", check if arg startswith ", and if not wrap it - locally this seems to fix my problem. A different approach would be to force user use _quote function when needed (also fixes my problem locally).