Better algorithm for stream flow control

mozilla / neqo

Neqo, the Mozilla Firefox implementation of QUIC in Rust

https://firefox-source-docs.mozilla.org/networking/http/http3.html

Apache License 2.0

1.84k stars 124 forks source link

Better algorithm for stream flow control #733

Open agrover opened 4 years ago

agrover commented 4 years ago

What we have now is the thing we did that was fast to implement, rather than a fully thought-out implementation. We should improve our algorithm, when possible.

from https://github.com/mozilla/neqo/pull/726#discussion_r440584070:

We have an estimate of BDP we might use to tune this. We should aim for BDP*2 at a minimum if we have an algorithm like this, otherwise we will need more frequent sending of updates.

mxinden commented 7 months ago

Status quo neqo

Static connection flow-control limit:

https://github.com/mozilla/neqo/blob/4fc4d16744ede5b099dd5bb4bfd5922d07deb046/neqo-transport/src/connection/params.rs#L21

Static stream receive window limit:

https://github.com/mozilla/neqo/blob/c004359a817ffdea33394e94944d2f882e7e78af/neqo-transport/src/recv_stream.rs#L33

Static stream send window limit:

https://github.com/mozilla/neqo/blob/c004359a817ffdea33394e94944d2f882e7e78af/neqo-transport/src/send_stream.rs#L36

Other implementations

Potential solution

Implement auto-tuning based on Google's QUIC design document:

Algorithm

As above, the flow control window update triggered when: available window < max window size / 2, where available window = max receive window offset - bytes consumed

to realize auto-tuning, add the following logic just before issuing a window update

keep track of time interval between subsequent window updates, call this since_last_update.

if ( since_last_update < RTT trigger factor ) then max window size = MIN(max window size increase factor, upper bound).

trigger factor is 2

increase factor is 2

As above, window update sets: max_received window offset += (max window size - available window)

https://docs.google.com/document/d/1F2YfdDXKpy20WVKJueEf4abn_LVZHhMUMS5gX6Pgjl4/edit#heading=h.j3nq4bn2eaud

Used in quic-go.

Also I implemented this for a muxer on top of TCP in the past https://github.com/libp2p/rust-yamux/pull/176.

Alternatives

More research needed. Pointers very much appreciated.

martinthomson commented 6 months ago

I believe that a better approach would be not to double the window, but to increase the window by an amount equal to the amount consumed since the last increase if the time taken to consume the windows is less than one RTT (increased by a small fudge factor to allow slack for reporting delays). This is similar to the increase in Reno CC. The effect then would be that the window increases less rapidly and maybe more often, but it would end up with a commitment that is closer to a 2x factor of BDP, rather than being 4x or more.

mxinden commented 6 months ago

I believe that a better approach would be not to double the window, but to increase the window by an amount equal to the amount consumed since the last increase if the time taken to consume the windows is less than one RTT

Given that the premise ("if the time taken to consume the windows is less than one RTT") requires the window to be fully consumed, the "amount consumed" will be equal to the window, thereby the increase will be equal to the window, and thus the new window will be twice as large as the current window. How is that different to doubling the window as proposed above?

I might be confusing (a) the increase of the available window and (b) the increase of the maximum window size above.

@martinthomson would you mind rephrasing your proposal above, with the terminology used in fc.rs. For what it is worth, here is the implementation of the Google QUIC algorithm:

https://github.com/mozilla/neqo/blob/0ad1b77f9483568570c99a04fbdb4329ff6f50c9/neqo-transport/src/fc.rs#L393-L405

martinthomson commented 6 months ago

Sure. The goal is to increase the value of self.max_active based on perceived changes in the BDP, as observed by the throughput on the affected stream.

Therefore, I suggest that if the rate at which self.retired increases (that is, the change in that value, divided by the time elapsed) exceeds some function of self.max_active / path.rtt, then we can increase self.max_active by the amount that self.retired has increased.

This will approximately lead to a doubling as self.max_active determines how much data can be outstanding, but that doubling might happen on a shorter timescale than an RTT, meaning that rate increases could exceed one doubling per RTT. The same is true of what you proposed as well, just that this version will cap out at a lower value for self.max_active.

Note that we could increase only based on the amount by which the increase exceeds previous expectations, which would lead to a closer adaptation, but would be more reliant on ACK rate than the simple scheme.