aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org
Other
15.07k stars 2.01k forks source link

Broken HTTP request parsing: Upgrade: h2c header leads to discarded body #8414

Closed BlobbyBob closed 2 months ago

BlobbyBob commented 5 months ago

Describe the bug

When sending two TCP segments, the first containing all headers including the final \r\n\r\n and the second containing a JSON body, the server parses them as two request. The first has an empty body while the second fails with aiohttp.http_exceptions.BadStatusLine as the body is interpreted as HTTP method

To Reproduce

Use the following as client:

docker run --rm -it --net=host ghcr.io/tls-attacker/tlsanvil worker -controller localhost:5001

Use the following as server:

import aiohttp.web

class AnvilWorkerAPI:
    def __init__(self):
        self.app = aiohttp.web.Application()
        self.setup_routes()

    def setup_routes(self):
        self.app.router.add_post("/api/v2/worker/register", self.register)
        self.app.router.add_post("/api/v2/worker/fetch", self.fetch)

    async def register(self, request: aiohttp.web.Request):
        body = await request.read()
        print(request.path, body)
        return aiohttp.web.json_response({})

    async def fetch(self, request):
        body = await request.read()
        print(request.path, body)
        return aiohttp.web.json_response({"command": "OK", "job": None})

    def run(self, **kwargs):
        aiohttp.web.run_app(self.app, **kwargs)

api = AnvilWorkerAPI()
api.run(port=5001)

Expected behavior

HTTP Requests are decoded correctly

Logs/tracebacks

======== Running on http://0.0.0.0:5001 ========
(Press CTRL+C to quit)
/api/v2/worker/register b''
Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/aiohttp/web_protocol.py", line 350, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 557, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadStatusLine: 400, message:
  Invalid method encountered:

    b'{"name":"worker 418"}'
      ^
/api/v2/worker/fetch b''
Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/aiohttp/web_protocol.py", line 350, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
  File "aiohttp/_http_parser.pyx", line 557, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadStatusLine: 400, message:
  Invalid method encountered:

    b'{"id":null,"status":"IDLE","logs":"10:13:09 INFO : WorkerClient starting\\n10:13:09 INFO : Connecting to backend...\\n10:13:09 INFO : Connected\\n"}'
      ^

### Python Version

```console
$ python --version

aiohttp Version

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.9.5
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: None
Author-email: None
License: Apache 2
Location: /usr/local/lib/python3.9/dist-packages
Requires: yarl, multidict, async-timeout, frozenlist, attrs, aiosignal
Required-by:

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.0.5
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /usr/local/lib/python3.9/dist-packages
Requires: 
Required-by: yarl, aiohttp

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.9.4
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /usr/local/lib/python3.9/dist-packages
Requires: idna, multidict
Required-by: aiohttp

OS

Reproducible on both Arch Linux and openjdk:11 docker image (debian based)

Related component

Server

Additional context

Packet capture: test.pcap.gz

Code of Conduct

Dreamsorcerer commented 5 months ago

Would be better if we can add a test to test_parser.py reproducing the problem. Also, does it work without C parser (AIOHTTP_NO_EXTENSIONS=1)?

BlobbyBob commented 5 months ago

With AIOHTTP_NO_EXTENSIONS=1 there seems to be no issue

Dreamsorcerer commented 5 months ago

Then it likely needs a bug report at https://github.com/nodejs/llhttp/ with the actual message that trips the parser (they will need to reproduce it in isolation, without aiohttp or tlsanvil).

BlobbyBob commented 5 months ago

I've narrowed the issue a bit down. It has to do with the Upgrade: h2c header in the request. A more minimal reproduction example:

Server

import aiohttp.web

async def printer(request):
    print(request.path, await request.read())
    return aiohttp.web.json_response({})

app = aiohttp.web.Application()
app.router.add_post("/", printer)

aiohttp.web.run_app(app, port=8000)

Client

import socket
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 0)
sock.connect(("localhost", 8000))
sock.send(b"POST / HTTP/1.1\r\n"
          b"Connection: Upgrade\r\n"
          b"Content-Length: 2\r\n"
          b"Upgrade: h2c\r\n"
          b"Content-Type: application/json\r\n\r\n"
          b"{}")

print(sock.recv(1024))

sock.close()
BlobbyBob commented 5 months ago

If you still think it is an issue with llhttp, could you open an issue over there? I'm not familiar with that project, so the bug report would probably be not too precise and helpful.

doublex commented 5 months ago

We are faced with the same problem: http-body gets mangled. aiohttp 3.9.3 works.

Dreamsorcerer commented 5 months ago

I see the same result in 3.8. Test reproducer is:

def test_http_request_parser_upgrade(parser: Any) -> None:
    text = b"POST / HTTP/1.1\r\nConnection: Upgrade\r\nContent-Length: 2\r\nUpgrade: h2c\r\nContent-Type: application/json\r\n\r\n"
    msg = parser.feed_data(text)[0][0][0]
    assert parser.feed_data(b"{}") == ([], False, b"")

    assert msg.method == "POST"
    assert msg.path == "/"
    assert msg.url.path == "/"
    assert msg.version == (1, 1)
    assert not msg.should_close
    assert msg.compression is None
    assert msg.upgrade
    assert not msg.chunked

It only happens when the message is split over 2 feed_data() calls though.

Dreamsorcerer commented 2 months ago

@BlobbyBob Would you be able to try out #8597? You'll need to follow the instructions to build the C parser: https://github.com/aio-libs/aiohttp/blob/fix-fail-upgrade/vendor/README.rst

The issue seems to be caused by the condition we used to decide when the request has been upgraded.