django / daphne

Django Channels HTTP/WebSocket server
BSD 3-Clause "New" or "Revised" License
2.32k stars 256 forks source link

Daphne allows invalid characters within header names #497

Closed kenballus closed 4 months ago

kenballus commented 5 months ago

The bug

RFC 9110 defines the header names must consist of a single token, which is defined as follows:

  token          = 1*tchar
  tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
                 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
                 / DIGIT / ALPHA
                 ; any VCHAR, except delimiters

When Daphne receives a request with a header name containing any or all of the following characters, it does not reject the message:

\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f "(),/;<=>?@[/]{}\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef

Thus, when you send the following request:

GET / HTTP/1.1\r\n
\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f "(),/;<=>?@[/]{}\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef: whatever\r\n
\r\n

Daphne's interpretation is as follows:

[
    HTTPRequest(
        method=b'GET', uri=b'/', version=b'1.1',
        headers=[
            (b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x0b\x0c\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f "(),/;<=>?@[/]{}\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef', b'whatever'),
        ],
        body=b'',
    ),
]

(i.e. the invalid header name made it into the request unchanged)

Nearly all other HTTP implementations reject the above request, including Uvicorn, Hypercorn, AIOHTTP, Apache httpd, Bun, CherryPy, Deno, FastHTTP, Go net/http, Gunicorn, H2O, Hyper, Jetty, libsoup, Lighttpd, Mongoose, Nginx, Node.js LiteSpeed, Passenger, Puma, Tomcat, OpenWrt uhttpd, Uvicorn, Waitress, WEBrick, and OpenBSD httpd.

Your OS and runtime environment:

$ uname -a
Linux de8a63fbb2f4 6.7.2-arch1-2 #1 SMP PREEMPT_DYNAMIC Wed, 31 Jan 2024 09:22:15 +0000 x86_64 GNU/Linux

pip freeze:

asgiref==3.7.2
attrs==23.2.0
autobahn==23.6.2
Automat==22.10.0
cffi==1.16.0
constantly==23.10.4
cryptography==42.0.2
Cython==0.29.32
daphne @ file:///app/daphne
h2==4.1.0
hpack==4.0.0
hyperframe==6.0.1
hyperlink==21.0.0
idna==3.6
incremental==22.10.0
priority==1.3.0
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycparser==2.21
Pygments==2.14.0
pyOpenSSL==24.0.0
python-afl==0.7.4
PyYAML==6.0
service-identity==24.1.0
six==1.16.0
Twisted==23.10.0
txaio==23.1.1
typing_extensions==4.9.0
zope.interface==6.1

(Using Daphne built from main, with the latest commit being https://github.com/django/daphne/commit/993efe62ce9e9c1cd64c81b6ee7dfa7b819482c7 at the time of writing)

How you're running Channels (runserver? daphne/runworker? Nginx/Apache in front?)

Directly invoking Daphne from the command line.

Console logs and full tracebacks of any errors

N/A

kenballus commented 5 months ago

Of particular note is that \x00, \t, and are permitted, which have historically been sources of exploitable parsing discrepancies between origin servers and gateway servers.

carltongibson commented 5 months ago

@kenballus thanks for the report. Are you up for making a PR here?

carltongibson commented 5 months ago

h11 gives a regex for a header name here:

https://github.com/python-hyper/h11/blob/a2c68948accadc3876dffcf979d98002e4a4ed27/h11/_abnf.py#L10-L21C1

That's then validated when parsing the headers here: https://github.com/python-hyper/h11/blob/a2c68948accadc3876dffcf979d98002e4a4ed27/h11/_headers.py#L163

Gunicorn does similar here https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/gunicorn/http/message.py#L92-L93 (but matching bad headers, with what looks like a more complex regex.)

We'd handle that in the http and ws protocols:

https://github.com/django/daphne/blob/993efe62ce9e9c1cd64c81b6ee7dfa7b819482c7/daphne/http_protocol.py#L147-L158

https://github.com/django/daphne/blob/993efe62ce9e9c1cd64c81b6ee7dfa7b819482c7/daphne/ws_protocol.py#L37-L44

Need to investigate whether this should be handled by twisted. https://github.com/twisted/twisted/blob/446ee139189440e890b26a29af256e9b9d0e8eba/src/twisted/web/http_headers.py

carltongibson commented 5 months ago

h11 test examples for no weird characters in names are here:

https://github.com/python-hyper/h11/blob/a2c68948accadc3876dffcf979d98002e4a4ed27/h11/tests/test_headers.py#L24-L35

carltongibson commented 4 months ago

Should be addressed by #500.

kenballus commented 1 month ago

Thank you very much @carltongibson! I was about to start working on this just now, only to find that you've beaten me to it :)

carltongibson commented 1 month ago

You're welcome @kenballus. Thanks for the follow up. 🎁