Pylons / waitress

Waitress - A WSGI server for Python 3
https://docs.pylonsproject.org/projects/waitress/en/latest/
Other
1.44k stars 164 forks source link

Uncaptured exception instead of 400 when receiving non-ascii bytes in request url #64

Closed dpetukhov closed 7 years ago

dpetukhov commented 10 years ago

Hi.

I have waitress installed on my production server and it generates strange exceptions. Like this:

uncaptured python exception, closing channel <waitress.channel.HTTPChannel connected 127.0.0.1:54006 at 0xb034478c>
(<class 'UnicodeDecodeError'>:'ascii' codec can't decode byte 0xd0 in position 124: ordinal not in range(128)
[/usr/lib/python3.2/asyncore.py|read|83]
[/usr/lib/python3.2/asyncore.py|handle_read_event|449]
[/home/novo/py32/libpython3.2/site-packages/waitress/channel.py|handle_read|175]
[/home/novo/py32/lib/python3.2/site-packages/waitress/channel.py|received|192]
[/home/novo/py32/lib/python3.2/site-packages/waitress/parser.py|received|102]
[/home/novo/py32/lib/python3.2/site-packages/waitress/parser.py|parse_header|206]
[/home/novo/py32/lib/python3.2/site-packages/waitress/parser.py|split_uri|254]
[/usr/lib/python3.2/urllib/parse.py|urlsplit|314]
[/usr/lib/python3.2/urllib/parse.py|_coerce_args|101]
[/usr/lib/python3.2/urllib/parse.py|_decode_args|85]
[/usr/lib/python3.2/urllib/parse.py
<genexpr>|85])

I found that these exceptions generated when some stupid bot sends request with raw bytes \xd0 in http header in first line. Here is a line from nginx log (nginx proxy queries to waitress):

54.209.115.31 - myhost.ru - [03/Jul/2014:15:41:46 +0000] "GET /lands/lands/%D0%B1%D0%B0%D0%B9%D0%BA%D0%B0%D0%BB/%D0%BC%D0%B0%D0%BB%D0%BE%D0%B5-%D0%BC%D0%BE%D1%80%D0%B5/lands/\xD0\xB1\xD0\xB0\xD0\xB9\xD0\xBA\xD0\xB0\xD0\xBB/\xD0\xBC\xD0\xB0\xD0\xBB\xD0\xBE\xD0\xB5-\xD0\xBC\xD0\xBE\xD1\x80\xD0\xB5/\xD0\xB7\xD0\xB0\xD0\xBB\xD0\xB8\xD0\xB2-\xD0\xBA\xD1\x83\xD1\x80\xD0\xBA\xD1\x83\xD1\x82\xD1\x81\xD0\xBA\xD0\xB8\xD0\xB9/\xD0\xBA\xD1\x83\xD1\x80\xD0\xBA\xD1\x83\xD1\x82/id330/ HTTP/1.1" 502 172 "-" "" "-"

As you can see, here the client sent both urlescaped and raw bytes in GET. According to RFC, url must by escaped. But nginx doesn't care.

I also wrote simple POC to demonstrate the issue:

def send_xdo(host, port=80):
    s = socket.create_connection((host, port))
    req = b"GET /\xd0 HTTP/1.1\n\n"
    s.send(req)
    s.recv(4096)
    s.close()

send_xdo('localhost', 6543)

I think Waitress should be able to handle this exception. It will be better just send 400 Bad Request to the client. Take a look at the rfc - http://tools.ietf.org/html/rfc2616#section-5.1.2

The Request-URI is transmitted in the format specified in section 3.2.1. If the Request-URI is encoded using the "% HEX HEX" encoding [42], the origin server MUST decode the Request-URI in order to properly interpret the request. Servers SHOULD respond to invalid Request-URIs with an appropriate status code.

mcdonc commented 10 years ago

FWIW, as a note to myself, the UnicodeDecodeError doesn't occur under Python 2, only under Python 3.

Flimm commented 7 years ago

What about modifying the test in the way that I suggested in this pull request #162 ?

Flimm commented 7 years ago

Thank you @bertjwregeer !

To be clear to anyone lurking, the fix has been merged but not yet released on PyPI.

Flimm commented 6 years ago

waitress 1.1.0 has been released on PyPI, and it includes this fix.