python / cpython

The Python programming language
https://www.python.org
Other
62k stars 29.81k forks source link

Security hole in wsgiref.headers.Headers #55880

Open c58a5bd9-14e3-4a94-be97-96c7404dab0f opened 13 years ago

c58a5bd9-14e3-4a94-be97-96c7404dab0f commented 13 years ago
BPO 11671
Nosy @pjeby, @vstinner, @tiran, @vadmium, @epicfaace
PRs
  • python/cpython#15299
  • Files
  • header_newlines_tip.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['type-security', '3.8', '3.7', 'library', '3.9'] title = 'Security hole in wsgiref.headers.Headers' updated_at = user = 'https://bugs.python.org/FelixGrbert' ``` bugs.python.org fields: ```python activity = actor = 'epicfaace' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'Felix.Gr\xc3\xb6bert' dependencies = [] files = ['29238'] hgrepos = [] issue_num = 11671 keywords = ['patch'] message_count = 10.0 messages = ['132080', '132132', '132393', '182766', '182781', '182782', '182973', '182976', '195487', '195491'] nosy_count = 8.0 nosy_names = ['pje', 'vstinner', 'christian.heimes', 'Arfrever', 'devin', 'Felix.Gr\xc3\xb6bert', 'martin.panter', 'epicfaace'] pr_nums = ['15299'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'security' url = 'https://bugs.python.org/issue11671' versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9'] ```

    c58a5bd9-14e3-4a94-be97-96c7404dab0f commented 13 years ago

    As noted by security@python.org's response I'm filing this bug here.

    In wsgiref.headers.Headers it is possible to include headers which contain a newline (i.e. \n or \r) either through addheader or \_init__. It is not uncommon that developers provide web applications to the public in which the HTTP response headers are not filtered for newlines but are controlled by the user. In such scenarios a malicious user can use a newline to inject another header or even initiate a HTTP response body. The impact would be at least equivalent to XSS. Therefore, I suggest to filter/warn/except header tuples which contain the above characters upon assignment in wsgiref.headers.

    afcdb1a8-bc5d-4852-a2a5-71b4ae6361ae commented 13 years ago

    It is not uncommon that developers provide web applications to the public in which the HTTP response headers are not filtered for newlines but are controlled by the user.

    Really? Which applications, and which response headers?

    Therefore, I suggest to filter/warn/except header tuples which contain the above characters upon assignment in wsgiref.headers.

    Applications that send them are not WSGI compliant anyway, since the spec forbids control characters in header strings -- and wsgiref.validate already validates this.

    Still, I'm not aware of any legitimate use case for apps sending user input as an HTTP header where the data wouldn't already be escaped in some fashion -- cookies, URLs, ...?

    c58a5bd9-14e3-4a94-be97-96c7404dab0f commented 13 years ago

    If the spec forbids control characters in headers, the module should enforce that.

    The most frequent example of header injection is the redirect-case: an application is forwarding using the Location header to a user-supplied URL. http://google.com/codesearch?as_q=self.redirect%5C%28self.request.get Other examples are proxies, setting user-agent, or, as you mention, custom set-cookies headers.

    2fb7f418-2576-4ac8-a7c4-99ab5ffbee45 commented 11 years ago

    Should now be compliant with this part of the spec:

    "Each header_value must not include any control characters, including carriage returns or linefeeds, either embedded or at the end. (These requirements are to minimize the complexity of any parsing that must be performed by servers, gateways, and intermediate response processors that need to inspect or modify response headers.)"

    2fb7f418-2576-4ac8-a7c4-99ab5ffbee45 commented 11 years ago

    backported patch to 2.7

    2fb7f418-2576-4ac8-a7c4-99ab5ffbee45 commented 11 years ago

    backported patch to 2.6

    vstinner commented 11 years ago

    + if bad_header_value_re.search(_value): + error_str = "Bad header value: {0!r} (bad char: {1!r})" + raise AssertionError(error_str.format( + _value, bad_header_value_re.search(_value).group(0)))

    Why do you search the character twice? You can do something like:

    match = bad_header_value_re.search(_value)
    if match is not None:
      ... match..group(0) ...

    Why do you only check value? You should also check _params:

    parts = "; ".join(parts) match = bad_header_value_re.search(parts) ...

    And you should also check the name.

    Should we do the same checks in httplib?

    2fb7f418-2576-4ac8-a7c4-99ab5ffbee45 commented 11 years ago

    The spec doesn't say anything about the header name. It probably should though, as the same issue exists there.

    I used two searches because that's how it's done in wsgiref.validate, and it's not a huge deal to do that because the second one will only execute when there's an error. That said, I changed it to how you proposed.

    Here's another stab at that patch.

    tiran commented 11 years ago

    What do the RFCs for RFC-822 and HTTP 1.1 say about \r and \n in header names?

    2fb7f418-2576-4ac8-a7c4-99ab5ffbee45 commented 11 years ago

    It looks like it's allowed for header line continuation.

    http://www.ietf.org/rfc/rfc2616.txt

    HTTP/1.1 header field values can be folded onto multiple lines if the continuation line begins with a space or horizontal tab. All linear white space, including folding, has the same semantics as SP. A recipient MAY replace any linear white space with a single SP before interpreting the field value or forwarding the message downstream.

    ...

    A CRLF is allowed in the definition of TEXT only as part of a header field continuation. It is expected that the folding LWS will be replaced with a single SP before interpretation of the TEXT value.