python / cpython

The Python programming language
https://www.python.org

multipart/related header causes false positive StartBoundaryNotFoundDefect and MultipartInvariantViolationDefect #80407

Open 164a7bc6-8686-4eed-9de5-3276f8858e6a opened 5 years ago

164a7bc6-8686-4eed-9de5-3276f8858e6a commented 5 years ago
BPO 36226
Nosy @warsaw, @bitdancer, @vadmium, @maxking, @tzickel, @grische
PRs
  • python/cpython#12214
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['3.7', 'type-bug', 'library']
    title = 'multipart/related header causes false positive StartBoundaryNotFoundDefect and MultipartInvariantViolationDefect'
    updated_at =
    user = 'https://github.com/grische'
    ```

    bugs.python.org fields:
    ```python
    activity =
    actor = 'ned.deily'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation =
    creator = 'cschmidbauer'
    dependencies = []
    files = []
    hgrepos = ['381']
    issue_num = 36226
    keywords = ['patch']
    message_count = 5.0
    messages = ['337395', '337401', '337444', '337661', '351114']
    nosy_count = 6.0
    nosy_names = ['barry', 'r.david.murray', 'martin.panter', 'maxking', 'tzickel', 'cschmidbauer']
    pr_nums = ['12214']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36226'
    versions = ['Python 3.7']
    ```

    164a7bc6-8686-4eed-9de5-3276f8858e6a commented 5 years ago

    The current implementation of multipart/related in urllib triggers header defects even though the headers are valid: [StartBoundaryNotFoundDefect(), MultipartInvariantViolationDefect()]

    The example header is valid according to RFC 2387 (https://tools.ietf.org/html/rfc2387):

    Content-Type: multipart/related; boundary="==="
    

    Both defects arise because httplib passes only the headers to the underlying email parser, while the email parser expects to receive a full message. The simple fix is to tell the underlying email parser that we are passing only the headers: 0a89fc15c93c271eb08e46e2cda9a72adb97d633

    The second issue is related, but independent: the underlying email parser checks whether the parsed message is multipart by checking whether the object "root" is of type list. Because we passed only the headers (and set headersonly=True), that check no longer makes sense at this point and creates a false positive: fdc7c47b77e330a36255fd00dc36accd72824e5b
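The behaviour described above can be reproduced with the email parser alone. The following is a minimal sketch, not the code from the PR: it feeds the example header block to `email.parser.Parser` once as a full message and once with `headersonly=True`, and lists the defects each mode records. The names `RAW_HEADERS` and `defect_names` are made up for this illustration.

```python
from email.parser import Parser

# Header block from the report (valid per RFC 2387), followed by no body.
# This mirrors an HTTP client handing only the headers to the email parser.
RAW_HEADERS = 'Content-Type: multipart/related; boundary="==="\r\n\r\n'


def defect_names(message):
    """Return the class names of the defects recorded on a parsed message."""
    return [type(defect).__name__ for defect in message.defects]


# Default mode: the parser treats the input as a complete message and
# searches the (empty) body for the start boundary.
full_defects = defect_names(Parser().parsestr(RAW_HEADERS))

# headersonly=True: the parser stops after the header block and never
# looks for a boundary in the body.
header_only_defects = defect_names(
    Parser().parsestr(RAW_HEADERS, headersonly=True)
)

print("full parse:  ", full_defects)
print("headers only:", header_only_defects)
```

With `headersonly=True` the start-boundary defect should no longer appear. Depending on the Python version, the headers-only list may still contain `MultipartInvariantViolationDefect`, which is the second, independent issue described above.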

    164a7bc6-8686-4eed-9de5-3276f8858e6a commented 5 years ago

    Apologies, here are the correct commit IDs:

    https://github.com/python/cpython/commit/89285439c7f94a3e62cee3d15e343218903c2af8

    https://github.com/python/cpython/pull/12214/commits/a82e662ab3339072d7b86a8090989fba60ef9c37

    vadmium commented 5 years ago

    Probably the same as bpo-29353. I remember that enabling "headersonly" can create inconsistencies in the message object, but I don't remember the details.

    According to bpo-29991 (another duplicate), my patch for bpo-24363 might help. But I don't think I got much enthusiasm reviewing that (and I don't have time to spend on it now).

    164a7bc6-8686-4eed-9de5-3276f8858e6a commented 5 years ago

    @martin.panter: I see a relation to bpo-29353, but I don't see why this report here is a duplicate. Could you elaborate on this?

    bpo-29991 contains parts of what I reported here, but it is closed "resolved" and refers back to 29353.

    I also tried your patch "policy-flag.patch", and it did not help with the bug reported here or with the tests included in the PR.

    e888b79e-0010-4e57-af49-96f92d830757 commented 5 years ago

    It should be noted that this causes a big headache for users of requests / urllib3 / etc., because those libraries log a warning on every multipart response as a result of this bug, which may lead people to try debugging code that is actually valid:

    https://github.com/urllib3/urllib3/issues/800

    https://github.com/psf/requests/issues/3001

    https://github.com/diyan/pywinrm/issues/269

    https://github.com/jborean93/pypsrp/issues/39

    https://github.com/home-assistant/home-assistant/pull/17042

    https://github.com/Azure/azure-storage-python/issues/167

    and others....