private-octopus / picoquic

Minimal implementation of the QUIC protocol

Long period with low CWND on WiFi #1697

Open huitema opened 1 month ago

huitema commented 1 month ago

On stationary Wi-Fi, when testing high-definition real-time video, we sometimes observe a succession of Wi-Fi suspensions, followed by a long period of low-bandwidth transmission.

image

The series of yellow vertical bars in the RTT graph corresponds to "end of suspension" behavior. All the packets that were queued during a Wi-Fi suspension are delivered at once, and acknowledged at very short intervals. We see 19 bars like that between T=11,000 and T=14,150, an average interval of 166ms. After T=12,000, the RTT is always about 140ms.

When the RTT drops to 37ms, packets are acked rapidly. There is an exponential process going on, and the bandwidth is restored at T=15,535. Interestingly, it takes 37 RTT to increase the bandwidth by a factor of 11. That seems to correspond more or less to 3-4 RTT for a 25% increase, which could be a cycle of probe BW Down, Cruise (half the cycles), Refill and Up. This is much slower than startup, which would have restored the BW in just 3 or 4 RTT.
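The estimate above can be checked with a short calculation. This is only a sketch: it assumes each probing cycle raises the bandwidth by exactly 25% and that startup doubles the bandwidth every RTT. The function name `steps_needed` is illustrative and is not part of picoquic.

```python
import math

def steps_needed(growth_factor: float, gain_per_step: float) -> float:
    """Number of multiplicative steps needed to grow by growth_factor."""
    return math.log(growth_factor) / math.log(gain_per_step)

# Figures taken from the trace: 37 RTT to grow the bandwidth by a factor of 11.
observed_rtts = 37
factor = 11.0

# Assumption: each ProbeBW cycle yields a 25% bandwidth increase.
probe_cycles = steps_needed(factor, 1.25)
rtts_per_cycle = observed_rtts / probe_cycles

# Assumption: startup doubles the bandwidth once per RTT.
startup_rtts = steps_needed(factor, 2.0)

print(f"ProbeBW cycles needed: {probe_cycles:.2f}")
print(f"RTT per ProbeBW cycle: {rtts_per_cycle:.2f}")
print(f"Startup RTT needed:    {startup_rtts:.2f}")
```

Under these assumptions both the RTT per cycle and the startup RTT count land between 3 and 4, which matches the figures quoted above.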

The slow "probeBW_UP" behavior does not really correspond to the probeBW_UP spec. In theory, BBR should remain in the "probeBW_UP" state as long as the measured bandwidth is growing, and should exit the state if the bandwidth has not grown for 3 successive RTT. But this is an area where the slide deck presenting BBR differs from the draft. We probably need to revise that part of the code.
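The "in theory" exit rule described above could be sketched as follows. This is a minimal illustration of the rule as stated in this comment, not the picoquic code and not the text of the BBR draft. The class `ProbeUpTracker`, its method names, and the choice to count any increase over the best sample as "growing" are all assumptions.

```python
class ProbeUpTracker:
    """Hypothetical helper: decides whether to stay in the ProbeBW_UP state."""

    def __init__(self, max_stalled_rounds: int = 3):
        # Exit after this many successive rounds without bandwidth growth.
        self.max_stalled_rounds = max_stalled_rounds
        self.best_bw = 0.0
        self.stalled_rounds = 0

    def on_round_end(self, measured_bw: float) -> bool:
        """Record one per-RTT sample. Return True to stay, False to exit."""
        if measured_bw > self.best_bw:
            # Bandwidth is still growing: remember it and reset the counter.
            self.best_bw = measured_bw
            self.stalled_rounds = 0
        else:
            self.stalled_rounds += 1
        return self.stalled_rounds < self.max_stalled_rounds

# Made-up samples: three rounds of growth, then three rounds without growth.
tracker = ProbeUpTracker()
samples = [10.0, 12.5, 15.6, 15.6, 15.0, 15.2]
decisions = [tracker.on_round_end(bw) for bw in samples]
print(decisions)
```

With these samples the tracker stays in the state for the first five rounds and exits on the sixth, after three successive rounds without growth. A real implementation would probably need a growth threshold to filter out measurement noise.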

The drop in bandwidth after about 10 suspensions is probably spurious. The issue would not have appeared if the 10th and 11th suspensions had been processed like the previous ones. This needs to be investigated, understood and fixed.

huitema commented 1 month ago

Additional tests point to a couple of issues with the BBR "Probe BW UP" behavior: