Haivision / srt

Secure, Reliable, Transport
https://www.srtalliance.org
Mozilla Public License 2.0

out-of-band LOSSREPORT #2074

Open ISSchuster opened 3 years ago

ISSchuster commented 3 years ago

Latest SRT version - following error:

SRT target connected
20:06:57.969252/SRT:RcvQ:w4!W:SRT.in: @377310145:rcv LOSSREPORT rng 2147483647 - -819141302 with last sent 1328342364 - DISCARDING
20:06:57.970365/SRT:RcvQ:w4!W:SRT.in: @377310145:out-of-band LOSSREPORT received; BUG or ATTACK - last sent %1328342364 vs loss %-819141302

It happens with only one customer / target.

srt-live-transmit udp://@239.99.99.1:2137 srt://xxx.xxx.xxx.xxx:4301

It works for about 1-2 hours, then it breaks up. What can cause this problem, and how can I solve it?

Ubuntu 20.04 with Docker - 4 other streams are working - most with an older version of SRT...

I'm now testing the older version 1.4.2 - I will update this report in a few hours.

ISSchuster commented 3 years ago

Update:

Same problem with 1.4.2 - but without an error message.

Stream stopped after about 20 minutes.

It doesn't reconnect - have to restart SRT.

Media path: 'udp://@239.99.99.1:2137' --> 'srt://xxx.xxx.xxx.xxx:4301'
SRT target connected
11844 bytes lost, 2220663144 bytes sent, 2220674988 bytes received
5975956 bytes lost, 2220663144 bytes sent, 2226639100 bytes received
12229588 bytes lost, 2220663144 bytes sent, 2232892732 bytes received
18470060 bytes lost, 2220663144 bytes sent, 2239133204 bytes received
24723692 bytes lost, 2220663144 bytes sent, 2245386836 bytes received
30976008 bytes lost, 2220663144 bytes sent, 2251639152 bytes received
37221744 bytes lost, 2220663144 bytes sent, 2257884888 bytes received
43472744 bytes lost, 2220663144 bytes sent, 2264135888 bytes received
49726376 bytes lost, 2220663144 bytes sent, 2270389520 bytes received
55978692 bytes lost, 2220663144 bytes sent, 2276641836 bytes received
62221796 bytes lost, 2220663144 bytes sent, 2282884940 bytes received
68475428 bytes lost, 2220663144 bytes sent, 2289138572 bytes received
74730376 bytes lost, 2220663144 bytes sent, 2295393520 bytes received
80970848 bytes lost, 2220663144 bytes sent, 2301633992 bytes received
87223164 bytes lost, 2220663144 bytes sent, 2307886308 bytes received
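To make the pattern in these counters easier to spot, here is a minimal Python sketch that parses the status lines and checks whether the "sent" counter has stopped while "received" keeps growing. The function names (`parse_stats`, `sender_stalled`) are hypothetical helpers, not part of any SRT tool, and the reading of the three fields is an assumption based only on the text printed above.

```python
import re

# Matches the status lines printed by the application, as quoted above.
STAT_RE = re.compile(r"(\d+) bytes lost, (\d+) bytes sent, (\d+) bytes received")

# First three status lines copied from the report.
SAMPLE = """\
11844 bytes lost, 2220663144 bytes sent, 2220674988 bytes received
5975956 bytes lost, 2220663144 bytes sent, 2226639100 bytes received
12229588 bytes lost, 2220663144 bytes sent, 2232892732 bytes received
"""


def parse_stats(text):
    """Return a list of (lost, sent, received) integer tuples."""
    return [tuple(int(g) for g in m.groups()) for m in STAT_RE.finditer(text)]


def sender_stalled(samples):
    """True if 'sent' never changes while 'received' strictly increases."""
    if len(samples) < 2:
        return False
    sent_values = {sent for _, sent, _ in samples}
    received = [rcv for _, _, rcv in samples]
    growing = all(b > a for a, b in zip(received, received[1:]))
    return len(sent_values) == 1 and growing


if __name__ == "__main__":
    samples = parse_stats(SAMPLE)
    print("sender stalled:", sender_stalled(samples))
    for lost, sent, received in samples:
        # In the quoted lines, "lost" equals "received" minus "sent".
        print(lost, received - sent)
```

In the quoted output, "bytes sent" stays at 2220663144 on every line and "bytes lost" equals "bytes received" minus "bytes sent", which suggests the input is still arriving but nothing is going out over SRT any more.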

DevSysEngineer commented 3 years ago

I think the SRT team needs a bit more detail before they can help. In my experience with SRT, it helps to use debug mode and store the stats. From the logs I can see whether SRT is resending packets, etc., and help you with debugging.

ISSchuster commented 3 years ago

Tried it with: srt-live-transmit -logfa:all -loglevel:error -logfile:"/tmp/error1.log" udp://@239.99.99.1:2137 srt://xxx.xxx.xxx.xxx:4301

The logfile gets created but it stays empty. So unfortunately I can't provide more information.

Will now try to find out what happens if we don't push the stream...

maxsharabayko commented 3 years ago

@ISSchuster A network capture would be helpful here. SRT receiver is also of the latest version, right? Do you apply any special network impairments? What kind of network is used? I wonder if this could be reproduced on our side somehow.

The warning message says a loss of packets with sequence numbers in the range [2147483647; -819141302] is reported. Those sequence numbers are not correct, because a sequence number must be a non-negative signed 32-bit integer. Here both values are likely negative. In a loss report, if the most significant bit of a lost packet sequence number is set, that value identifies the start of a range of sequence numbers.
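As an illustration of the encoding described above, here is a minimal Python sketch that decodes a loss list made of 32-bit words, where a word with the most significant bit set starts a range and the following word ends it. This is not the SRT library's code; `as_u32` and `decode_loss_list` are hypothetical names, and the input is assumed to be a plain list of integers already extracted from the packet.

```python
MSB = 0x80000000       # top bit of a 32-bit word: marks the start of a range
SEQ_MASK = 0x7FFFFFFF  # the remaining 31 bits carry the sequence number


def as_u32(value):
    """Reinterpret a (possibly negative) 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


def decode_loss_list(words):
    """Decode loss-list words into a list of (first, last) sequence ranges."""
    ranges = []
    i = 0
    while i < len(words):
        word = as_u32(words[i])
        if word & MSB:
            # Range start: the next word must be the range end (MSB clear).
            if i + 1 >= len(words):
                raise ValueError("range start without a range end")
            last = as_u32(words[i + 1])
            if last & MSB:
                raise ValueError("range end has the most significant bit set")
            ranges.append((word & SEQ_MASK, last))
            i += 2
        else:
            # Single lost packet.
            ranges.append((word, word))
            i += 1
    return ranges


if __name__ == "__main__":
    print(decode_loss_list([0x80000005, 9, 12]))
    # The negative value from the warning, viewed as an unsigned 32-bit word:
    print(as_u32(-819141302), bool(as_u32(-819141302) & MSB))
```

Viewed this way, the value -819141302 from the warning is the unsigned word 3475825994, which has its most significant bit set. Masking that bit off gives 1328342346, which is close to the "last sent" value 1328342364 in the same log line. This is only an arithmetic observation, not a diagnosis.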

So far it looks like the receiver has some issues and sends invalid loss reports.

ethouris commented 3 years ago

@ISSchuster: Also as per logging options:

The debug log might at least give us an idea of the exact circumstances in which it happened. There was a bug in an older version around this: both an overly blind interpretation of the loss report (you can see it prevented in 1.4.3) and wrong generation of the loss report. The sequence numbers you showed in the warning logs suggest some very weird screwup of the sequence numbers, so the full available log info might give us a better idea of where this screwup happened, both from the sender and the receiver (the sender is even more important here).
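To show what a non-blind interpretation of a loss report could look like, here is a hedged Python sketch of a sender-side sanity check. It is an illustration under stated assumptions, not the check implemented in SRT 1.4.3: sequence numbers are assumed to live in a 31-bit space that wraps around, and `seq_cmp` and `loss_range_acceptable` are hypothetical names.

```python
SEQ_MAX = 0x7FFFFFFF        # largest valid 31-bit sequence number
SEQ_THRESHOLD = 0x3FFFFFFF  # beyond this distance, assume wrap-around


def is_valid_seq(seq):
    """A sequence number must fit in 31 bits and be non-negative."""
    return 0 <= seq <= SEQ_MAX


def seq_cmp(a, b):
    """Wrap-aware comparison: >0 if a is after b, <0 if before, 0 if equal."""
    diff = a - b
    return diff if abs(diff) < SEQ_THRESHOLD else -diff


def loss_range_acceptable(lo, hi, last_sent):
    """Accept a reported loss range only if it could refer to sent packets."""
    if not (is_valid_seq(lo) and is_valid_seq(hi)):
        return False  # e.g. a negative value, as in the warnings above
    if seq_cmp(lo, hi) > 0:
        return False  # range is reversed
    if seq_cmp(hi, last_sent) > 0:
        return False  # range reaches past what was actually sent
    return True


if __name__ == "__main__":
    # Values taken from the two warnings quoted in this thread.
    print(loss_range_acceptable(2147483647, -819141302, 1328342364))
    print(loss_range_acceptable(773760549, -1, 773760577))
    # A made-up range that lies entirely below "last sent".
    print(loss_range_acceptable(773760549, 773760560, 773760577))
```

Under this check both ranges from the warnings are rejected because their upper ends are negative, which matches the DISCARDING behaviour shown in the log.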

ISSchuster commented 3 years ago

I'm still waiting for some information regarding the receiver. It's a television playout; I hope to get more information soon. I'm using the latest SRT version.

I also tried to create a debug logfile, but there is not really more in it on my side.

Hope the customer will do the same - and then we can probably provide more information.

All I got is:

12:53:37.044345/srt-live-transm D:SRT.sm: generateSocketID: : @85995341
12:53:37.044461/srt-live-transm D:SRT.km: CHANNEL: Bound to local address: [SERVER]:4301
12:53:37.083768/SRT:RcvQ:w1.N:SRT.cn: PASSING request from: [CLIENT]:36169 to agent:85995341
12:53:37.083862/SRT:RcvQ:w1.N:SRT.cn: Listener managed the connection request from: [CLIENT]:36169 result:waveahand
12:53:37.101714/SRT:RcvQ:w1.N:SRT.cn: PASSING request from: [CLIENT]:36169 to agent:85995341
12:53:37.101744/SRT:RcvQ:w1 D:SRT.sm: generateSocketID: : @85995340
12:53:37.102956/SRT:RcvQ:w1.N:SRT.cn: listen ret: -1 - conclusion
12:53:37.102966/SRT:RcvQ:w1.N:SRT.cn: Listener managed the connection request from: [CLIENT]:36169 result:waveahand
12:53:37.121101/SRT:RcvQ:w1.N:SRT.cn: sendSrtMsg: cmd=3(KMREQ) len=56 KmState: SND=SECURING RCV=UNSECURED
12:53:37.141490/SRT:RcvQ:w1.N:SRT.cn: processSrtMsg_KMRSP: cmd=4(KMRSP) len=224 KmState: SND=SECURED RCV=SECURED
12:55:59.790630/SRT:RcvQ:w1!W:SRT.in: @85995340:rcv LOSSREPORT rng 773760549 - -1 with last sent 773760577 - DISCARDING
12:55:59.790690/SRT:RcvQ:w1!W:SRT.in: @85995340:out-of-band LOSSREPORT received; BUG or ATTACK - last sent %773760577 vs loss %-1

maxsharabayko commented 3 years ago

The SRT receiver version can also be determined from a network capture on the sender side (from the peer's handshake response). Would you mind collecting a network capture?

ISSchuster commented 3 years ago

@maxsharabayko

The receiver is a Haivision Media Gateway, Version 1.4.1

maxsharabayko commented 3 years ago

@ISSchuster I would still kindly insist on a network capture if possible.

The payload is secure if encryption is enabled. We do not need the payload to analyze this issue. If you would like to hide public IPs, utilities like pcap-sanitizer should help.