Haivision / srt

Secure, Reliable, Transport
https://www.srtalliance.org
Mozilla Public License 2.0

[BUG] ACKD_RCVSPEED, ACKD_BANDWIDTH, ACKD_RCVRATE are not accurate #1118

Open runner365 opened 4 years ago

runner365 commented 4 years ago

Describe the bug
I want to build an SVC encoder that adapts to network quality, which can be estimated from the receive bitrate measured by the SRT receiver (server). I want to get the receiver's rcv_bitrate on the encoder side (SRT sender), so the encoder can adjust its encoding bitrate according to the receiver's rcv_bitrate.


                      mpegts over srt
encoder --------------------------->srt server
      |                        (calculates recv bitrate)
      |                              |
  recv full ACK <------------------- sends full ACK carrying rcv bytes/s

First, I use ffmpeg to push an SRT live stream to an SRT live server built with the SRT API.

 ffmpeg -re -i /mnt/movie/109351.1.mp4 -c copy -f mpegts 'srt://127.0.0.1:10080?streamid=#!::h=live/livestream,m=publish'

In ffmpeg's libsrt.c, I use srt_bstats to get SRT_TRACEBSTATS, which has mbpsBandwidth. I find that mbpsBandwidth is much larger than the real upload bitrate, so I can't use it to adjust the encoding bitrate.
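For reference, a minimal sketch of how those statistics are read through the SRT C API (error handling and the actual ffmpeg integration omitted; the include path may differ per install):

    #include <srt/srt.h>   // adjust the include path to your installation
    #include <cstdio>

    // Print the sender-side bandwidth estimate and send rate for a connected socket.
    static void print_srt_rates(SRTSOCKET sock)
    {
        SRT_TRACEBSTATS stats;
        // Third argument 0 = do not clear the interval counters.
        if (srt_bstats(sock, &stats, 0) != SRT_ERROR)
        {
            std::printf("estimated bandwidth: %.3f Mbps, send rate: %.3f Mbps\n",
                        stats.mbpsBandwidth, stats.mbpsSendRate);
        }
    }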

So I think I can instead use the receive packets/s and bytes/s carried in the full ACK message sent by the SRT receiver:

    ACKD_RCVSPEED = 4,   // length would be 16
    ACKD_BANDWIDTH = 5,
    ACKD_RCVRATE = 6,

So I added some printf code in core.cpp:

        // pktps and bandwidth were read earlier in the same ACK-handling code
        // from ackdata[ACKD_RCVSPEED] and ackdata[ACKD_BANDWIDTH].
        if (acksize > ACKD_TOTAL_SIZE_UDTBASE) {
            bytesps = ackdata[ACKD_RCVRATE];
            if (pktps != 0) {
                printf("rcv1 ack ACKD_RCVSPEED(pkg/s):%d, ACKD_BANDWIDTH(pkgs/s):%d, ACKD_RCVRATE(bytes/s):%d.\r\n",
                    pktps, bandwidth, bytesps);
            }
        }

The printf log output:

rcv1 ack ACKD_RCVSPEED(pkg/s):3125, ACKD_BANDWIDTH(pkgs/s):965, ACKD_RCVRATE(bytes/s):3659861.
rcv1 ack ACKD_RCVSPEED(pkg/s):3125, ACKD_BANDWIDTH(pkgs/s):965, ACKD_RCVRATE(bytes/s):3659861.
rcv1 ack ACKD_RCVSPEED(pkg/s):5465, ACKD_BANDWIDTH(pkgs/s):978, ACKD_RCVRATE(bytes/s):6318762.
rcv1 ack ACKD_RCVSPEED(pkg/s):6494, ACKD_BANDWIDTH(pkgs/s):978, ACKD_RCVRATE(bytes/s):7605900.
rcv1 ack ACKD_RCVSPEED(pkg/s):6494, ACKD_BANDWIDTH(pkgs/s):978, ACKD_RCVRATE(bytes/s):7605900.
rcv1 ack ACKD_RCVSPEED(pkg/s):6494, ACKD_BANDWIDTH(pkgs/s):978, ACKD_RCVRATE(bytes/s):7605900.
rcv1 ack ACKD_RCVSPEED(pkg/s):2933, ACKD_BANDWIDTH(pkgs/s):978, ACKD_RCVRATE(bytes/s):3429636.
rcv1 ack ACKD_RCVSPEED(pkg/s):2933, ACKD_BANDWIDTH(pkgs/s):978, ACKD_RCVRATE(bytes/s):3429636.
rcv1 ack ACKD_RCVSPEED(pkg/s):2434, ACKD_BANDWIDTH(pkgs/s):978, ACKD_RCVRATE(bytes/s):2709083.

ACKD_RCVRATE(bytes/s) is huge, for example ACKD_RCVRATE(bytes/s):2709083. That means 2709083*8 = 21.67 Mbps, but the real encoder bitrate is less than 1 Mbps.

And ACKD_BANDWIDTH(pkgs/s):978 is also much too large. From the documentation and code, ACKD_BANDWIDTH is calculated by packet-pair probing on the SRT receiver side, but I think the result is not right.
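For context, the receiver-side estimate roughly follows the UDT packet-pair approach: probing packets are sent back to back at regular intervals, and the receiver derives capacity from a median-filtered inter-arrival time of those pairs. A simplified sketch of that idea (not the exact SRT code, which also discards outlier samples):

    #include <algorithm>
    #include <vector>

    // probe_gaps_us: recent inter-arrival times (microseconds) of probing packet pairs.
    // Returns the estimated link capacity in packets per second.
    double packet_pair_capacity(std::vector<double> probe_gaps_us)
    {
        if (probe_gaps_us.empty())
            return 0.0;

        // Median of the recorded pair gaps; the real code additionally filters out
        // samples far away from this median before averaging.
        std::sort(probe_gaps_us.begin(), probe_gaps_us.end());
        const double median_us = probe_gaps_us[probe_gaps_us.size() / 2];

        return median_us > 0.0 ? 1e6 / median_us : 0.0;
    }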

I think SRT is a good protocol that can support an SVC encoder if the estimated bitrate is accurate. Please help.

Best regards,

maxsharabayko commented 4 years ago

Hi @runner365 Thanks for sharing. That is a known issue we are working on. You might also be interested in some discussion in #1096.

At the moment we have some ideas on how to improve it. We are working on tuning and validating the algorithm.

Regarding the ACKD_RCVRATE, it is not meant to be used in live mode. It was designed solely for file transfer mode. We also have plans to make use of it in the live mode.

Don't expect fast resolution here though. Accurate bandwidth estimation is a deep and highly empirical problem.

runner365 commented 4 years ago

Hi @maxsharabayko Thanks for the reply. After reading the SRT code, I think we can use the BBR congestion algorithm to estimate the bitrate state (increase, decrease, or keep). In the SRT_TRACEBSTATS struct, I can use msRTT, pktFlightSize (in-flight), and mbpsSendRate to estimate the bitrate state.

At the optimal operating point, the total data in flight is equal to the BDP (bandwidth-delay product):


// BDP (bandwidth-delay product); keep units consistent with inflight
BDP = send_bitrate_max * rtt_min;
if (BDP > inflight) {
    // headroom available: increase encoder bitrate
} else if (BDP < inflight) {
    // queue building up: decrease encoder bitrate
}
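A minimal sketch of that rule using the SRT_TRACEBSTATS fields mentioned above (illustrative only; the payload size of 1316 bytes and the plain thresholds are assumptions of mine):

    #include <srt/srt.h>

    enum BitrateAction { INCREASE, DECREASE, KEEP };

    // Decide the bitrate direction from a connected sender socket.
    BitrateAction bitrate_action(SRTSOCKET sock, int payload_bytes = 1316)
    {
        SRT_TRACEBSTATS stats;
        if (srt_bstats(sock, &stats, 1 /* clear interval counters */) == SRT_ERROR)
            return KEEP;

        // BDP in packets = send rate (packets/s) * RTT (s).
        const double send_pkts_per_s = stats.mbpsSendRate * 1e6 / 8.0 / payload_bytes;
        const double bdp_pkts        = send_pkts_per_s * (stats.msRTT / 1000.0);

        if (bdp_pkts > stats.pktFlightSize)
            return INCREASE;   // headroom: the pipe is not yet full
        if (bdp_pkts < stats.pktFlightSize)
            return DECREASE;   // packets are queuing up
        return KEEP;
    }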

I have already coded this in ffmpeg to support an SRT adaptive-bitrate encoder. GitHub address: https://github.com/runner365/srt_encoder

We have also added an SRT live server to SRS, which supports the RTMP/HLS/RTP live protocols.

Please help to improve the BBR algorithm I coded in ffmpeg if you find a better solution or bugs.

Best regards.

maxsharabayko commented 4 years ago

Hi @runner365 BBR is something we would like to add. The SRT library offers a plugin structure for congestion control (see this structure). It would be great if you could add BBR as a PR to SRT. There are some tricky parts, e.g. RTT values that are not instant measurements but a moving average, and so on. But still, this is something worth working on.

ethouris commented 4 years ago

I've been doing some experiments with the description of the BBR algorithm, and I have also tried some ideas of my own. I'm not sure where the guys from Google got their research results from, but I have conducted some experiments measuring the network factors under various congestion-control conditions, and my results differ significantly.

First, BBR is based on the statement that potential packet loss will happen at the moment when RTT significantly increases, so the optimal sending speed is just below the limit where RTT starts to increase. This alone - even leaving aside that I had to add enhancements to SRT to implement statistics properly and collect data similar to what TCP collects - turned out to be only part of the truth. In fact, RTT is extremely low in the beginning, and when you exchange roughly 4 packets in the first 500 ms it does not increase. The situation starts to change once you send data packets at a "usual" data speed (not even the maximum speed your first-hand machine can do). RTT then starts to rise horribly and very quickly, which is normal and bearable, and you shouldn't treat it too seriously. Only after some time - roughly 1 second, though in most cases you can simply wait for the first loss report - can you first slow down to get an idea of how much the network buffers have actually swelled, and what RTT should be at a stable running speed. Only once you achieve stability (treating the initial part as a measurement period) can you probe the network to see whether it can sustain a higher speed.

An important part of BBR is that it is described in terms of the "flight window". SRT does have this measurement, but it's rather minor, and SRT also has much better tools to measure network conditions, such as the reception speed. It shouldn't be forgotten that the size of the flight window is a factor influenced by many different conditions, each of which of course has its source in the current network capacity, but their varying influence makes sniffing the current network capacity out of it extremely hard. The network capacity itself also varies, so results need to be "averaged", but only within some narrow range, and many phenomena take time to show up.

For a network-adaptive stream, the best approach I would point to is rather this: you should be able to send at the maximum wanted bitrate, while also defining some bitrate that must be ensured as a minimum. You start with the maximum so that you quickly hit the ceiling in case the network currently has no capacity for that stream; then the capacity measurement should feed the results back to the application and tell it what bitrate is recommended now that the capacity is lower than required. Then, after some period of stability, the bitrate can be slightly increased. The biggest problem here would be how to pick the best "highest" bitrate when you don't know your network conditions on a particular link at all - this would probably have to be probed first.
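Roughly, as a sketch (the names are illustrative, not anything existing in SRT; measured_capacity_bps stands for whatever estimator would eventually be exposed, and feeding the result back to the encoder is left to the application):

    #include <algorithm>

    // Illustrative back-off / probe-up loop for the approach described above.
    struct AdaptiveBitrate
    {
        double max_bps;             // highest bitrate the application wants
        double min_bps;             // bitrate that must always be ensured
        double current_bps;
        int    stable_periods = 0;

        double update(double measured_capacity_bps)
        {
            if (measured_capacity_bps < current_bps)
            {
                // Capacity is below the current rate: back off immediately,
                // but never below the ensured minimum.
                current_bps = std::max(min_bps, measured_capacity_bps);
                stable_periods = 0;
            }
            else if (++stable_periods >= 10)
            {
                // After a period of stability, probe slightly upwards.
                current_bps = std::min(max_bps, current_bps * 1.05);
                stable_periods = 0;
            }
            return current_bps;     // feed this back to the encoder
        }
    };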

runner365 commented 4 years ago

> Hi @runner365 BBR is something we would like to add. The SRT library offers a plugin structure for congestion control (see this structure). It would be great if you could add BBR as a PR to SRT. There are some tricky parts, e.g. RTT values that are not instant measurements but a moving average, and so on. But still, this is something worth working on.

@maxsharabayko Thanks for your response. I started reading the SrtCongestion code in congctl.cpp/h. The RTT there is an averaged RTT: const int rtt = ackdata[ACKD_RTT]; m_iRTT = avg_iir<8>(m_iRTT, rtt); The full ACK is sent at a 10 ms interval, so an RTT averaged over about 80 ms (8 x 10 ms) is OK for the algorithm.
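For reference, avg_iir<8> computes roughly the following exponentially weighted average (a sketch of the helper as I read it, not the exact SRT template):

    // Sketch of what avg_iir<8> computes: an IIR (exponentially weighted) average
    // where each new sample contributes 1/8 of the result - an effective memory on
    // the order of 8 ACK intervals (~80 ms), though not a strict window.
    template <int N, typename ValueType>
    ValueType avg_iir_sketch(ValueType old_value, ValueType new_value)
    {
        return (old_value * (N - 1) + new_value) / N;
    }

    // Usage matching the SRT code above: m_iRTT = avg_iir_sketch<8>(m_iRTT, rtt);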

I use the ACKD_RCVSPEED from the ACK as maxBW. This needs to be improved.

The in-flight count is obtained by: pktFlightSize = CSeqNo::seqlen(m_iSndLastAck, CSeqNo::incseq(m_iSndCurrSeqNo)) - 1; This doesn't include retransmitted packets. It would be better to include the retransmitted packet count in the in-flight figure.

maxsharabayko commented 4 years ago

@runner365 Cool! It would be great to get your contribution on this, of any kind! The following article might be useful for you: link. Also please feel free to join our Slack channel for more detailed discussions: invitation link. There is a dedicated "congestion_control" channel.

> The in-flight count is obtained by: pktFlightSize = CSeqNo::seqlen(m_iSndLastAck, CSeqNo::incseq(m_iSndCurrSeqNo)) - 1; This doesn't include retransmitted packets. It would be better to include the retransmitted packet count in the in-flight figure.

Only unacknowledged packets can be retransmitted. The receiver can't acknowledge a packet if there are missing packets prior to it. m_iSndLastAck is the latest acknowledged packet; m_iSndCurrSeqNo is the latest sent packet (original, not retransmitted). The difference shows the number of packets that were sent but whose status is still unknown.

E.g. in 1 2 3 x 5 6 7, where n is a received packet number and x is a lost packet, only the first three packets can be acknowledged until the missing fourth packet reaches the receiver.

> const int rtt = ackdata[ACKD_RTT]; m_iRTT = avg_iir<8>(m_iRTT, rtt);

Note that ackdata[ACKD_RTT] was also averaged by the receiver before sending it. Issue #782 addresses this.

runner365 commented 4 years ago

@maxsharabayko Thanks for your help. Now I understand. Best regards.

ethouris commented 4 years ago

The "flight window" is defined - well, theoretically (but same for TCP) - as a distance between the packet being received and being sent at the same moment, in sequence numbers; the problem here though is that it's not measureable in this form, and therefore it's not exactly true even for TCP. In TCP it's simply measured at the moment of acknowledgement, which is believed to be sent back immediately when particular packet has arrived, so then at the sender side it is confronted with the sequence number of the packet about to be sent and that's how the flight window is defined. If you have lost packets then simply your window will stretch dramatically and in result you'll have to slowdown sending (please note that BBR description bases on the flight window size, not on the sending speed).

The correct way to measure the flight window is to compare against the sequence number of the packet that was really the last one received at the moment of sending the ACK (which need not be the same as the ACK sequence), although the reception speed should still be measured over valid packets received (meaning all belated or doubly-retransmitted packets should be skipped, but retransmitted packets should be taken into account). Still, the current flight window measurement is intended for file transmission, so it should behave similarly to TCP's, even though retransmission in SRT is more efficient.

It actually depends on what you want to measure, which factors you want to take into account, and how. The distance between the ACK-received packet sequence and the being-sent packet sequence is the "real flight window", that is, the number of packets that have departed from the source but haven't reached the target yet. This isn't exactly accurate either, because the ACK packet also takes some time to travel, so the size of the flight could be overestimated. But then, it depends on what you want to use this information for. In BBR this value (including the extension due to packet loss) is used to control the sending and to slow down if it gets too big. This value could at best be thought of as a control of the current flow, together with the number of lost packets, to see how close the current speed is to the momentary network capacity. It's likely that if the number of lost packets grows, so will the real flight window. This way, the real flight window can give you an idea of what to expect when you slow down sending (the real flight window should go hand in hand with the sending speed when you experience no losses).
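As a hypothetical illustration with plain integers (ignoring SRT's sequence-number wraparound arithmetic), the two measurements differ like this:

    // TCP-style flight window: measured against the last acknowledged sequence.
    // With losses, the ACK number stalls at the first gap, so this window stretches.
    int flight_window_ack_based(int last_sent_seq, int last_acked_seq)
    {
        return last_sent_seq - last_acked_seq;
    }

    // "Real" flight window: measured against the last sequence actually received
    // at the moment the ACK was generated, so it keeps tracking delivered packets.
    int flight_window_real(int last_sent_seq, int last_received_seq_in_ack)
    {
        return last_sent_seq - last_received_seq_in_ack;
    }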

I have conducted far too few experiments so far to have something reasonable for improving the congestion control algorithm. On my branch def-fag (in my fork of this repo) I have added some extra statistics, including the last received sequence carried with the ACK and the "receiver velocity" (a receiver speed averaged over a very short distance); however, the code there is highly unstable and was reported to cause a memory overrun. You can look into it to see what kind of extra statistics are being retrieved, if you'd like to play around with it.