Add streaming backpressure #152

Open pgjones opened 2 years ago

pgjones commented 2 years ago

Some more context on when this is relevant:

flowchart LR
    httpx-->|slow network| server
    subgraph one[Host 1]
        client-->|localhost\nconnection| quart
        subgraph quart[Quart app]
            httpx[httpx async\nclient]
        end
    end
    subgraph two[Host 2]
        server
    end

In this configuration, the Quart app acts as a proxy. Because the httpx client consumes data slowly (limited by the network to Host 2), Quart buffers the incoming data, which is being uploaded at very high speed over the localhost connection. For huge payloads, this results in either memory allocation errors or the out-of-memory killer killing the app.

^ From @andrewsh
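
For concreteness, the proxy might look roughly like the sketch below. The route and target URL are made up; only the async iteration over request.body and httpx's streaming upload via content= are real API surface.

import httpx
from quart import Quart, request

app = Quart(__name__)


@app.route("/upload", methods=["POST"])
async def proxy_upload():
    # Illustrative proxy route: forward the request body chunk by chunk.
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://host2.example/upload",  # hypothetical slow server
            content=request.body,           # async iterable of chunks
        )
    return resp.text, resp.status_code

Without backpressure, the localhost side fills Quart's buffer far faster than httpx can drain it towards Host 2.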

laggardkernel commented 2 years ago

Doesn't look like a problem for Quart, the ASGI app.

ASGI separates the ASGI server from the ASGI app. All the app gets is scope, receive(), and send(): it reads request data and returns a response. The ASGI server handles the connection and all of the data reading and sending.

In my understanding, "adding streaming backpressure" means calling transport.pause_reading() in the ASGI app. This seems to contradict the ASGI abstraction above.

From what I've read,

In asyncio, the transport defines pause_reading() and resume_reading(), which are called by the protocol; the protocol defines pause_writing() and resume_writing(), which are called by the transport. The split exists because, from the app's point of view, request data is read from the transport and response data is written out through it.
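
A bare asyncio protocol shows the pairing. This is a generic sketch (EchoProtocol and the Event-based write gate are illustrative), not code from any ASGI server:

import asyncio


class EchoProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport
        self.can_write = asyncio.Event()
        self.can_write.set()

    def data_received(self, data):
        # Read-side control lives on the transport and is called by the
        # protocol: stop the socket until this chunk has been handled.
        self.transport.pause_reading()
        asyncio.ensure_future(self._handle(data))

    async def _handle(self, data):
        await self.can_write.wait()  # honour write-side backpressure
        self.transport.write(data)
        self.transport.resume_reading()

    # Write-side control lives on the protocol and is called by the
    # transport when its outgoing buffer crosses the water marks.
    def pause_writing(self):
        self.can_write.clear()

    def resume_writing(self):
        self.can_write.set()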

ASGI servers like uvicorn and hypercorn add more layers under the protocol, e.g. RequestResponseCycle (wrapping await app()) in uvicorn, or HTTPStream and Context.spawn_app() in hypercorn. Access to transport.pause_reading() and resume_reading() would have to be extended into these new layers. Likewise, pause_writing() and resume_writing() should be driven by the new innermost layer rather than by the protocol.

uvicorn introduces a FlowControl object to share read/write availability between the protocol and RequestResponseCycle: the protocol pauses reading when the buffered body in RequestResponseCycle exceeds HIGH_WATER_LIMIT, and once receive() is called by the ASGI app, it resumes reading.
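
The shared object looks roughly like the sketch below; this is a loose paraphrase of the idea, not uvicorn's actual code, and the limit value is illustrative.

HIGH_WATER_LIMIT = 65536  # illustrative threshold


class FlowControl:
    """Shared between the protocol (producer) and the request cycle
    (consumer) so either side can pause or resume socket reads."""

    def __init__(self, transport):
        self.transport = transport
        self.read_paused = False

    def pause_reading(self):
        if not self.read_paused:
            self.read_paused = True
            self.transport.pause_reading()

    def resume_reading(self):
        if self.read_paused:
            self.read_paused = False
            self.transport.resume_reading()

The protocol calls flow.pause_reading() when its buffered body grows past HIGH_WATER_LIMIT; receive(), when the app finally awaits it, drains the buffer and calls flow.resume_reading().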

The backpressure thing sounds like an ASGI server thing.

pgjones commented 2 years ago

It is both. At the moment there is no way for the app to place backpressure on the server. In the ASGI setup this could be done, for example, by not awaiting receive whilst the app catches up.
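
In raw ASGI terms that looks something like the sketch below, where consume_chunk is a placeholder for a slow downstream consumer; it only has an effect if the server refrains from reading ahead of the app's receive() calls.

import asyncio


async def consume_chunk(data: bytes) -> None:
    # Placeholder for a slow consumer, e.g. an upload over a slow network.
    await asyncio.sleep(0.1)


async def app(scope, receive, send):
    assert scope["type"] == "http"

    more_body = True
    while more_body:
        # The app does not await receive() again until the previous chunk
        # has been fully consumed; a cooperating server stops reading from
        # the socket in the meantime.
        message = await receive()
        if message["type"] != "http.request":
            return
        await consume_chunk(message.get("body", b""))
        more_body = message.get("more_body", False)

    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"done"})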

lordmauve commented 1 year ago

Is this about streaming uploads?

I'm seeing 413 errors due to hitting MAX_CONTENT_LENGTH, even though I'm streaming the body to disk. The docs say

This allows larger bodies to be received and consumed if desired. The key being that the data is consumed and not otherwise stored in memory. An example is,

async def route():
    async for data in request.body:
        # Do something with the data
        ...

it is advisable to add a timeout within each chunk if streaming the request.

But this is not working: Body.append() is sync, simply appends to the buffer, and then fails with HTTP 413 if MAX_CONTENT_LENGTH is exceeded. Because it is sync it cannot block when the buffer is "full", so it doesn't transmit backpressure from the application code.

The backpressure would need to be applied here at the ASGI layer such that if the request body buffer is not draining then we don't accept a new ASGI message on this connection.
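
For illustration, one shape this could take is an awaitable append with a high-water mark. Everything below is hypothetical (Quart's current Body.append() is synchronous, as noted above): the idea is that the ASGI receive loop awaits append(), and while that blocks, no further receive() happens and the server stops reading from the socket.

import asyncio


class BackpressuredBody:
    """Hypothetical awaitable body buffer with a high-water mark."""

    def __init__(self, high_water: int = 2**18) -> None:
        self._chunks = asyncio.Queue()
        self._buffered = 0
        self._high_water = high_water
        self._drained = asyncio.Event()
        self._drained.set()

    async def append(self, data: bytes) -> None:
        # Called from the ASGI receive loop; blocks while the app is
        # behind, which stops the server pulling more off the socket.
        await self._drained.wait()
        self._chunks.put_nowait(data)
        self._buffered += len(data)
        if self._buffered > self._high_water:
            self._drained.clear()

    def close(self) -> None:
        # Signal end-of-body (called when more_body is False).
        self._chunks.put_nowait(None)

    async def __aiter__(self):
        while True:
            data = await self._chunks.get()
            if data is None:
                return
            self._buffered -= len(data)
            if self._buffered <= self._high_water:
                self._drained.set()
            yield data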