philippemnoel / whist-whist

Whist Browser
https://whist.com
Apache License 2.0
8 stars 3 forks source link

Consider control retransmission bandwidth instead of NACK bitrate #5582

Open wangyu- opened 2 years ago

wangyu- commented 2 years ago

Feature Description

At the moment, server retransmits a segment/packet whenever it receives a NACK from client. That is out of control of the bitrate algrithm.

The only control of retransmission we have, is to control NACK bitrate at client side. Previous that's MAX_NACK_AVG_MBPS andMAX_NACK_BURST_MBPS. That's hand tuned, doesn't adapt to the network. Later it was changed to MAX_NACK_BITRATE_RATIO it becomes a bit better, since it somehow ties to the bitrate algorithm.

But there is still the problems:

1) when there are loss/retranssmion, the server sends signatifcantly more packets than when there are no. 2) we dare not to tune MAX_NACK_BITRATE_RATIO too high, since retransmission will stress the bandwith and affect the tranmission of normal packet. we dare not to tune MAX_NACK_BITRATE_RATIO too low, since that will lead to not enough bandwith for retransmission.

Suggestion:

1) don't retransmist a segment/packet whenever it receives a NACK from client, do retransmit when bandwith allows 2) control the retransmission bandwidh instead of the NACK bitrate. 3) make both the orignal packet and the retransmisison under the limit of bitrate. (don't let birate control only the original packets, let bitrate control both) 4) when bandwith is not enough for all the original packets + retransmission packets, let server make "smart" decision based on the priority of packet.

examples of the smart decisions:

1) guarantee the tranmission of original packet first, do retranmission only if there are left bandwidth. 2) when the bandwidth budget is tight, retransmit newer packets with higher priority than old packet. 3) retransmit important frames (iframe?) with a higher priority than normal packet 4) retransmist packets that seems to be recover easily with higher priority 5) don't retransmis a packet that has an id lower than a i-frame/ltr-frame that is confirmed to be received by client. (Need help with ACK)

The Problem It Fixes

Advantage:

  1. more precies bandwith control
  2. only need the bitrate, other than bitrate + NACK bitrate, no hand tuning of NACK bitrate

Disadvantage:

  1. complexity
  2. might need to hold retransmission for a few ms, to wait for decision.
npip99 commented 2 years ago

we dare not to tune MAX_NACK_BITRATE_RATIO too high, since retransmission will stress the bandwith and affect the tranmission of normal packet.

Responding to this, when NACKs are sent, this is included in the bitrate throttling. The bandwidth of these NACKs will simply reduce framerate, which is intended and desired. (Since the ratio is 50%, this will lower framerate to worst-case 30FPS during periods of congestion, which is ideal). NACKs are always higher priority than new frames, which is important, since the client can't even catch up until those NACKs are addressed. (Unless a stream reset happens, #5747 is the tracking issue for that)

npip99 commented 2 years ago

An additional potential optimization on this topic:

WebRTC appears to track the "Average % bandwidth used for NACKs", and sets serverside video bandwidth to "Make room" for these expected NACKs. For example, if 20% of bandwidth was used on NACKs in a given 1000ms interval, WebRTC will set the serverside to only use 80% of the total bandwidth for the next 1000ms, out of expectation that 20% will end up being used on NACKs. This improves video quality during the time that WCC takes to figure out what the actual bandwidth is. We probably want to integrate this into our code as well, since the concept fits well with our algorithms and methods.

wangyu- commented 2 years ago

Responding to this, when NACKs are sent, this is included in the bitrate throttling. The bandwidth of these NACKs will simply reduce framerate, which is intended and desired. (Since the ratio is 50%, this will lower framerate to worst-case 30FPS during periods of congestion, which is ideal).

I checked the related code, yeah, I believe we are doing this correctly.