meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0
8.17k stars 2.47k forks source link

[1.x] When leaving ungracefully and rejoining, other users don't get any data from publisher #3187

Closed phwang4 closed 1 year ago

phwang4 commented 1 year ago

What version of Janus is this happening on? commit 5bdf33c (1.1.3)

Have you tested a more recent version of Janus too? It is already the latest

Was this working before? Yes, it was working with non-multistream janus in the past

Is there a gdb or libasan trace of the issue? This is the non-working janus log where user leaves ungracefully (they refresh the webpage instead of hitting the leave call button, which probably causes the browser to do some of its own webRTC cleanup)

This is the working janus log where user leaves the call normally by hitting the leave call button

Additional context When I create a videoroom, I have the organizer publish on the same id, and each successive user has their id incremented by one from the previous. Anyone who joins the videoroom will send a 'join' request as both a publisher and subscriber. The subscriber 'join' request will subscribe to all available publishers, and fill the rest of the slots with dummy publishers. For this case, organizer will subscribe to three dummy publishers when joining, and the other user will subscribe to organizer and two other dummy publishers when joining.

Once the other user joins, the organizer will do a 'switch' request on mid '0' (to replace the dummy with the new feed id). Similarly, when the user leaves, the organizer will do a 'switch' request on that same mid ('0') back to the dummy feed id

For the not working case: The videoroom is 195853389045 Line 1974: When the user publishing on 195853389046 leaves, organizer switches their feed back to a dummy publisher, and janus does a detach. Line 2062 is when the user rejoins with id 195853389047 Line 2903 is when user 195853389047's joining then gets reflected onto the organizer (195853389045), who then switches at line 2943 At this point, organizer cannot see the user who rejoined, and checking webrtc internals shows that there is no data being sent to organizer but the other user is actually publishing

For the working case: The videoroom is 195853389050 Line 1825: When the user publishing on 195853389051 leaves, organizer switches their feed back to a dummy publisher, and janus does a detach. Line 1927 is when the user rejoins with id 195853389052 Line 2782 is when user 195853389052's joining then gets reflected onto the organizer (195853389050 ), who then switches at line 2813 At this point, organizer can see the user who rejoined just fine

lminiero commented 1 year ago

Are you setting autoupdate: false when joining subscribers? Otherwise a user leaving will result in an SDP offer that disables the related m-line(s), and I suspect that may cause a race with your dummy publisher switches.

phwang4 commented 1 year ago

I just tried again with that change but it didn't seem to have an effect.

Here is the janus.log with the new change

The videoroom is 196447998255 Line 5433: When the user publishing on 196447998256 leaves, organizer switches their feed back to a dummy publisher, and janus does a detach. Line 5515 is when the user rejoins with id 195853389052 Line 7475 is when user 195853389052's joining then gets reflected onto the organizer (195853389050 ), who then switches at line 7513

Just for reference, these are some example messages for joining as subscriber, publisher, the publish message, and switch message respectively. The first three are from the user who rejoined, and the switch is the organizer switching to said user.

{"autoupdate":false,"ptype":"subscriber","request":"join","room":196447998255,"streams":[{"feed":196447998255},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286},{"feed":7656148627985286}]}
{"id":196447998257,"ptype":"publisher","request":"join","room":196447998255}
{"body":{"audio":false,"bitrate":400000,"data":false,"request":"publish","video":true},"jsep":{"sdp":"v=0\r\no=- 1328808664441078350 2 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\na=group:BUNDLE 0\r\na=extmap-allow-mixed\r\na=msid-semantic: WMS c5a992b3-4548-4f66-b7cc-80e5ee9edc31\r\nm=video 9 UDP/TLS/RTP/SAVPF 96 97 102 103 104 105 106 107 108 109 127 125 39 40 45 46 98 99 100 101 112 113 114 115 116\r\nc=IN IP4 0.0.0.0\r\na=rtcp:9 IN IP4 0.0.0.0\r\na=ice-ufrag:TLcU\r\na=ice-pwd:9/8/BjyMlQBlJSV1dHR9N3wk\r\na=ice-options:trickle\r\na=fingerprint:sha-256 5E:87:7C:0F:CF:97:CE:EC:CE:CE:EE:99:43:7F:36:94:23:53:A3:6D:3D:5A:08:68:94:A4:FE:D3:2D:59:6F:76\r\na=setup:actpass\r\na=mid:0\r\na=extmap:1 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:3 urn:3gpp:video-orientation\r\na=extmap:4 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\r\na=extmap:5 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay\r\na=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/video-content-type\r\na=extmap:7 http://www.webrtc.org/experiments/rtp-hdrext/video-timing\r\na=extmap:8 http://www.webrtc.org/experiments/rtp-hdrext/color-space\r\na=extmap:9 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=extmap:10 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id\r\na=extmap:11 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id\r\na=sendrecv\r\na=msid:c5a992b3-4548-4f66-b7cc-80e5ee9edc31 847414c3-dfac-48c7-b496-47c63dde4848\r\na=rtcp-mux\r\na=rtcp-rsize\r\na=rtpmap:96 VP8/90000\r\na=rtcp-fb:96 goog-remb\r\na=rtcp-fb:96 transport-cc\r\na=rtcp-fb:96 ccm fir\r\na=rtcp-fb:96 nack\r\na=rtcp-fb:96 nack pli\r\na=rtpmap:97 rtx/90000\r\na=fmtp:97 apt=96\r\na=rtpmap:102 H264/90000\r\na=rtcp-fb:102 goog-remb\r\na=rtcp-fb:102 transport-cc\r\na=rtcp-fb:102 ccm fir\r\na=rtcp-fb:102 nack\r\na=rtcp-fb:102 nack pli\r\na=fmtp:102 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42001f\r\na=rtpmap:103 rtx/90000\r\na=fmtp:103 apt=102\r\na=rtpmap:104 H264/90000\r\na=rtcp-fb:104 goog-remb\r\na=rtcp-fb:104 transport-cc\r\na=rtcp-fb:104 ccm fir\r\na=rtcp-fb:104 nack\r\na=rtcp-fb:104 nack pli\r\na=fmtp:104 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42001f\r\na=rtpmap:105 rtx/90000\r\na=fmtp:105 apt=104\r\na=rtpmap:106 H264/90000\r\na=rtcp-fb:106 goog-remb\r\na=rtcp-fb:106 transport-cc\r\na=rtcp-fb:106 ccm fir\r\na=rtcp-fb:106 nack\r\na=rtcp-fb:106 nack pli\r\na=fmtp:106 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f\r\na=rtpmap:107 rtx/90000\r\na=fmtp:107 apt=106\r\na=rtpmap:108 H264/90000\r\na=rtcp-fb:108 goog-remb\r\na=rtcp-fb:108 transport-cc\r\na=rtcp-fb:108 ccm fir\r\na=rtcp-fb:108 nack\r\na=rtcp-fb:108 nack pli\r\na=fmtp:108 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=42e01f\r\na=rtpmap:109 rtx/90000\r\na=fmtp:109 apt=108\r\na=rtpmap:127 H264/90000\r\na=rtcp-fb:127 goog-remb\r\na=rtcp-fb:127 transport-cc\r\na=rtcp-fb:127 ccm fir\r\na=rtcp-fb:127 nack\r\na=rtcp-fb:127 nack pli\r\na=fmtp:127 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=4d001f\r\na=rtpmap:125 rtx/90000\r\na=fmtp:125 apt=127\r\na=rtpmap:39 H264/90000\r\na=rtcp-fb:39 goog-remb\r\na=rtcp-fb:39 transport-cc\r\na=rtcp-fb:39 ccm fir\r\na=rtcp-fb:39 nack\r\na=rtcp-fb:39 nack pli\r\na=fmtp:39 level-asymmetry-allowed=1;packetization-mode=0;profile-level-id=4d001f\r\na=rtpmap:40 rtx/90000\r\na=fmtp:40 apt=39\r\na=rtpmap:45 AV1/90000\r\na=rtcp-fb:45 goog-remb\r\na=rtcp-fb:45 transport-cc\r\na=rtcp-fb:45 ccm fir\r\na=rtcp-fb:45 nack\r\na=rtcp-fb:45 nack pli\r\na=rtpmap:46 rtx/90000\r\na=fmtp:46 apt=45\r\na=rtpmap:98 VP9/90000\r\na=rtcp-fb:98 goog-remb\r\na=rtcp-fb:98 transport-cc\r\na=rtcp-fb:98 ccm fir\r\na=rtcp-fb:98 nack\r\na=rtcp-fb:98 nack pli\r\na=fmtp:98 profile-id=0\r\na=rtpmap:99 rtx/90000\r\na=fmtp:99 apt=98\r\na=rtpmap:100 VP9/90000\r\na=rtcp-fb:100 goog-remb\r\na=rtcp-fb:100 transport-cc\r\na=rtcp-fb:100 ccm fir\r\na=rtcp-fb:100 nack\r\na=rtcp-fb:100 nack pli\r\na=fmtp:100 profile-id=2\r\na=rtpmap:101 rtx/90000\r\na=fmtp:101 apt=100\r\na=rtpmap:112 H264/90000\r\na=rtcp-fb:112 goog-remb\r\na=rtcp-fb:112 transport-cc\r\na=rtcp-fb:112 ccm fir\r\na=rtcp-fb:112 nack\r\na=rtcp-fb:112 nack pli\r\na=fmtp:112 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=64001f\r\na=rtpmap:113 rtx/90000\r\na=fmtp:113 apt=112\r\na=rtpmap:114 red/90000\r\na=rtpmap:115 rtx/90000\r\na=fmtp:115 apt=114\r\na=rtpmap:116 ulpfec/90000\r\na=ssrc-group:FID 3264700881 1173622635\r\na=ssrc:3264700881 cname:4lNbCw94xd47bElV\r\na=ssrc:3264700881 msid:c5a992b3-4548-4f66-b7cc-80e5ee9edc31 847414c3-dfac-48c7-b496-47c63dde4848\r\na=ssrc:1173622635 cname:4lNbCw94xd47bElV\r\na=ssrc:1173622635 msid:c5a992b3-4548-4f66-b7cc-80e5ee9edc31 847414c3-dfac-48c7-b496-47c63dde4848\r\n","type":"offer"}}
{"request":"switch","streams":[{"feed":196447998257,"mid":"0","sub_mid":"0"}]}
phwang4 commented 1 year ago

Update: I was able to reproduce this issue even when gracefully leaving calls. If I delay the switch message for about half a second so that it comes after the detach from janus for the organizer when the user leaves, that mid will have no data.

For the not working case, we actually get a DTLS alert (line 1951) instead of in the working case we get a logger "Hanging up PeerConnection because of a Detach" (line 1830). The DTLS alert likely does a similar cleanup to get into that "Detaching handle from JANUS SIP plugin" message, which may be why we're seeing this bug.

Here is the janus-delay log for the scenario discussed above.

The videoroom is 196706758490 Line 4731: This is where janus does the detach Line 4802: because I added in the half second delay, the switch to dummy comes afterwards And once the user comes back the organizer sees no data on that mid.

phwang4 commented 1 year ago

Update: I figured out the issue. I noticed that I was getting an extra sdp offer after the switch request that I wasn't doing anything with, and from the janus videoroom docs:

Notice that, while a switch request usually doesn't require a renegotiation, it MIGHT trigger one nevertheless

I had disregarded this initially since the case mentioned wasn't relevant (had dummy publishers so the subscriber streams would be active), but this seems to be one of those cases where it is getting triggered anyway. So after doing a renegotiation the video is flowing properly now.