signalwire / freeswitch

FreeSWITCH is a Software Defined Telecom Stack enabling the digital transformation from proprietary telecom switches to a versatile software implementation that runs on any commodity hardware. From a Raspberry PI to a multi-core server, FreeSWITCH can unlock the telecommunications potential of any device.
https://freeswitch.com/#getting-started
Other
3.54k stars 1.41k forks source link

FS sends Time=0 RTP packet after inbound packetloss #801

Open kpribic opened 4 years ago

kpribic commented 4 years ago

Describe the bug Scenario is A->FS->B. FS parks the call, then calls out to B, and B answers. When there is a little packet loss in incoming call then the outgoing call has RTP packets that have time=0 with a marker. Every one is followed by an RTP packet that has proper timer and also a marker. Example where packet seq 16328 in inbound stream 0x24A0E306 goes missing, and FS generates outbound packets with odd timestamps in stream 0x1733015E: image

This behavior correlates exactly with packet loss as noted in FS log. Screenshot above had this entry in log:

8d9b961e-ffa6-4475-8a16-41da5fb821bd 2020-08-10 13:19:31.947490 [ALERT] switch_rtp.c:1670 sofia/internal/Ronny65@326675.company.com Got: audio seq 16329 but expected: 16328 lost: 1

Every time there is a log entry like this for inbound stream, there is an RTP packet with time=0 in outbound. Other odd thing that I noticed is that there is no issue on reverse RTP stream, when there is packet loss on B->FS leg then there is no time=0 packet on FS->A leg. FS logs packet loss events, however doesn't generate time=0 RTP packet.

To Reproduce It's really hard to offer steps to reproduce because this happens in a specific environment. A and B sides in this scenario are webrtc clients behind Janus gateway. Here are conditions that allowed us to replicate behavior:

  1. A side was webrtc client running on Android, and B side chrome on windows or linux. Call is audio & video but there is no issue on video side.
  2. There must be some packetloss on A leg.
  3. FS must not transcode, codec on incoming leg must be passed through to oubound leg.
  4. All logs here are from opus test. When using g711 it also appears to have weird timestamps.
  5. Call must be longer for this to occur, 5-10mins.

Expected behavior I expected outbound RTP timestamps to monotonously rise, not get periodically reset to 0 and then to "proper" value.

Package version or git hash 1.10.2 Also replicated on 1.8.7.

Trace logs Freeswitch log: FS_log.zip

This a list of timestamps of all packets with time=0: LIstOfRTPTimestamps.txt

mjerris commented 4 years ago

please re-try this on latest release and if it is not fixed re-open this issue.

kpribic commented 4 years ago

Tested again, and I still get time=0 packets. @mjerris Please reopen the ticket, I don't think I can do it.

Tested with version:

FreeSWITCH Version 1.10.4-release~64bit (-release 64bit)

Behavior is the same as in previous releases: FS generates in this example SSRC=0x7B3D833A, keeps getting Time=0 packets). One difference is that there are no LOG entries notifying me of packetloss.

Packets with numbers 6484 and above are the issue:

$ tshark -r rtp5.pcap -t u "rtp.timestamp == 0"
4005   13:15:49 10.86.166.20 -> 10.86.164.113 RTP 101 PT=DynamicRTP-Type-111, SSRC=0x5A92182E, Seq=1, Time=0, Mark
4007   13:15:49 10.86.166.20 -> 10.86.164.113 RTP 769 PT=VP8, SSRC=0xC45D4C88, Seq=1, Time=0
4009   13:15:49 10.86.166.20 -> 10.86.164.113 RTP 769 PT=VP8, SSRC=0xC45D4C88, Seq=2, Time=0, Mark
4080   13:15:49 10.86.166.20 -> 10.86.164.113 RTP 117 PT=opus, SSRC=0x8604C20B, Seq=1, Time=0, Mark
4107   13:15:50 10.86.166.20 -> 10.86.164.113 RTP 614 PT=VP8, SSRC=0xFE694ED7, Seq=1, Time=0, Mark
6484   13:15:56 10.86.164.113 -> 10.86.166.20 RTP 94 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=1966, Time=0, Mark
12044   13:16:08 10.86.164.113 -> 10.86.166.20 RTP 96 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=2581, Time=0, Mark
41096   13:17:04 10.86.164.113 -> 10.86.166.20 RTP 97 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=5345, Time=0, Mark
44307   13:17:10 10.86.164.113 -> 10.86.166.20 RTP 97 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=5652, Time=0, Mark
48067   13:17:17 10.86.164.113 -> 10.86.166.20 RTP 97 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=6010, Time=0, Mark
100484   13:18:56 10.86.164.113 -> 10.86.166.20 RTP 98 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=10980, Time=0, Mark
119393   13:19:32 10.86.164.113 -> 10.86.166.20 RTP 94 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=12771, Time=0, Mark
132159   13:19:57 10.86.164.113 -> 10.86.166.20 RTP 95 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=13999, Time=0, Mark
145633   13:20:22 10.86.164.113 -> 10.86.166.20 RTP 95 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=15279, Time=0, Mark
165571   13:21:00 10.86.164.113 -> 10.86.166.20 RTP 94 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=17174, Time=0, Mark
181615   13:21:31 10.86.164.113 -> 10.86.166.20 RTP 96 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=18710, Time=0, Mark
194418   13:21:56 10.86.164.113 -> 10.86.166.20 RTP 96 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=19939, Time=0, Mark
216786   13:22:39 10.86.164.113 -> 10.86.166.20 RTP 99 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=22081, Time=0, Mark
266067   13:24:13 10.86.164.113 -> 10.86.166.20 RTP 98 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=26796, Time=0, Mark
268759   13:24:18 10.86.164.113 -> 10.86.166.20 RTP 95 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=27052, Time=0, Mark
317111   13:25:50 10.86.164.113 -> 10.86.166.20 RTP 94 PT=DynamicRTP-Type-102, SSRC=0x7B3D833A, Seq=31666, Time=0, Mark

This is the list of packet numbers of every packet that had "Wrong sequence number" in Wireshark analysis, meaning that it followed packetloss, in inbound stream 0x5a92182e. You may note that every packet number listed here is followed by RTP packet with time=0 in the outbound stream in the list above.

6460
12027
41090
44297
48053
100459
119376
132126
145602
165540
181594
194384
216768
266032
268717
317085

It looks like this in wireshark for one lost packet image

dragos-oancea commented 3 years ago

did you try latest FS master with this ? there's a change that might fix it.