ninenines / cowboy

Small, fast, modern HTTP server for Erlang/OTP.
https://ninenines.eu
ISC License

HTTP/2: too many stream window size updates after connection window size update #1643

Open RoadRunnr opened 3 months ago

RoadRunnr commented 3 months ago

Situation:

  1. A client opens a single connection and adds lots of streams. Cowboy's initial window size is set to 64k for the connection and also 64k for each stream. The client sends a smallish request on each stream, about 1k each. The server does not process the requests right away (simple slow processing).

  2. Each request consumes 1k of the connection window and 1k of its own stream window. After about 64 such requests (64k of request data), the connection window is exhausted.

  3. (Not sure if it matters.) Because the request sizes vary (each is only about 1k), the DATA frame of the request that consumes the last piece of the connection window is split into two DATA frames.

  4. Flow control does not apply to HEADERS frames, so the client can continue to open new streams, but it cannot send DATA frames on them because the connection window is exhausted. All those streams and requests are now stalled.

  5. At this point some request processing finishes and a few responses are sent.

  6. After the responses, Cowboy sends a connection-level WINDOW_UPDATE with a very large window increment.

  7. This is then followed by a WINDOW_UPDATE for every single stream that had sent HEADERS but no DATA frames yet.
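The accounting in steps 1 to 4 can be sketched as a small sender-side model. This is only an illustration, not gun's or Cowboy's code: the `Sender` class, the `run` helper, and the exact byte values (64k taken as 65536, each request body as exactly 1024 bytes) are assumptions. Because every request has the same size here, the split DATA frame from step 3 does not appear.

```python
from dataclasses import dataclass, field

INITIAL_WINDOW = 64 * 1024  # "64k" from the report, assumed to mean 65536 bytes
REQUEST_SIZE = 1024         # "about 1k" per request, assumed to be exactly 1024


@dataclass
class Sender:
    """Hypothetical client-side view of HTTP/2 flow-control windows."""
    conn_window: int = INITIAL_WINDOW
    stream_windows: dict = field(default_factory=dict)

    def open_stream(self, stream_id):
        # HEADERS frames are not flow-controlled, so opening never blocks.
        self.stream_windows[stream_id] = INITIAL_WINDOW

    def send_data(self, stream_id, size):
        # DATA is limited by both the connection window and the stream window.
        allowed = min(size, self.conn_window, self.stream_windows[stream_id])
        self.conn_window -= allowed
        self.stream_windows[stream_id] -= allowed
        return allowed


def run(num_streams):
    """Open num_streams streams and try to send one request body on each."""
    sender = Sender()
    sent = {}
    for i in range(num_streams):
        stream_id = 2 * i + 1  # client-initiated streams use odd identifiers
        sender.open_stream(stream_id)
        sent[stream_id] = sender.send_data(stream_id, REQUEST_SIZE)
    return sender, sent


if __name__ == "__main__":
    sender, sent = run(100)
    stalled = [sid for sid, n in sent.items() if n == 0]
    print("connection window left:", sender.conn_window)
    print("stalled streams:", len(stalled))
```

In this model the first 64 streams use up the connection window. Every later stream is opened but stalls with its own stream window untouched.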

That last step (7), depending on the number of pending requests, causes a rather large and IMHO unnecessary storm of stream window updates. All the pending streams had enough space left in their stream windows, so no change was needed there; the connection window update alone should have been enough to let them continue.
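The reasoning here, that a connection-level update alone unblocks streams whose own windows are still full, can be checked with a toy calculation. The `drain` function and all numbers below are hypothetical; this models only the sender's two limits and says nothing about why Cowboy emits the per-stream updates.

```python
def drain(pending_by_stream, conn_window, stream_windows):
    """Send as much pending DATA as both windows allow.

    Returns (bytes sent per stream, remaining connection window).
    Note: stream_windows is updated in place.
    """
    sent = {}
    for stream_id, pending in pending_by_stream.items():
        n = min(pending, conn_window, stream_windows[stream_id])
        conn_window -= n
        stream_windows[stream_id] -= n
        sent[stream_id] = n
    return sent, conn_window


if __name__ == "__main__":
    # 36 stalled streams, each with 1024 bytes pending and a full stream window.
    pending = {sid: 1024 for sid in range(129, 201, 2)}

    # Connection window exhausted: nothing can be sent.
    blocked, _ = drain(pending, 0, {sid: 65536 for sid in pending})
    print(sum(blocked.values()))

    # After a large connection-level WINDOW_UPDATE and no stream-level
    # updates, every pending body fits.
    unblocked, left = drain(pending, 1_000_000, {sid: 65536 for sid in pending})
    print(sum(unblocked.values()), left)
```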

essen commented 3 months ago

Do you have the WINDOW_UPDATE sizes that were sent, both for the connection and for each individual stream?

RoadRunnr commented 3 months ago

> Do you have the WINDOW_UPDATE sizes that were sent, both for the connection and for each individual stream?

Does a PCAP help? The data comes from a load generator that fills it with pseudo random values, therefore no problem sharing it here.

cowboy-h2-window-update.pcapng.gz

The client in this case was gun with default settings.

NOTE: this particular PCAP also has the problem that it exceeded the max_received_frame_rate, but that should not be relevant to the issue.

essen commented 3 months ago

Yes that's perfect, thank you.

Considering the increment, I think it's just Cowboy handlers that have called read_body, which defaults to a read length of 8MB. Because 8MB is more than the stream's current window, Cowboy increases the window preemptively to let the client upload without being blocked. Because many streams try to read bodies at the same time, Cowboy simply has to send many WINDOW_UPDATE frames.

Note that this only happens after the handlers have started calling read_body: Cowboy will not increase the window before then, both to avoid being overloaded and to avoid using resources if the body ends up discarded. This is why you end up with a split DATA frame in 3.

Cowboy does not loop over all streams sending a WINDOW_UPDATE, but each stream may independently send it. It appears weird because of the nature of the test.

In practice you may reduce the size read by read_body to avoid this. If you expect the body to be below, say, 10K, then set that as the length option and WINDOW_UPDATE frames will not be sent.
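To see why a smaller length avoids the updates, here is a deliberately simplified model of the behaviour described above. It is not Cowboy's actual algorithm: it assumes a stream-level WINDOW_UPDATE is sent only when the requested read length exceeds the stream's current window, and it ignores any margins or thresholds the real implementation may apply. The function names, the 65536-byte window, and the 8,000,000-byte reading of "8MB" are assumptions.

```python
DEFAULT_READ_LENGTH = 8_000_000  # the "8MB" default mentioned above (assumed byte value)
STREAM_WINDOW = 64 * 1024        # "64k" initial stream window (assumed to be 65536)


def stream_window_increment(read_length, stream_window):
    """Bytes to add so the window covers read_length; 0 means no WINDOW_UPDATE."""
    return max(0, read_length - stream_window)


def count_stream_updates(read_length, stream_windows):
    """How many streams would need a WINDOW_UPDATE for this read length."""
    return sum(
        1 for w in stream_windows
        if stream_window_increment(read_length, w) > 0
    )


if __name__ == "__main__":
    pending = [STREAM_WINDOW] * 36  # 36 handlers reading bodies at the same time
    print(count_stream_updates(DEFAULT_READ_LENGTH, pending))  # one update per stream
    print(count_stream_updates(10_000, pending))               # window already large enough
```

Under these assumptions, the 8MB default triggers one update per stream that reads a body, while a 10K length triggers none because it fits in the existing window.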