bkempe opened this issue 5 years ago
There is retry logic in BridgeChannel.js, but only for the WebSocket. Why is that?
This is because we establish the WebRTC data channel as part of ICE. When you interrupt the connection to the bridge, this triggers an ICE failure, and the bridge channel will be re-established with the new ICE session.
In your case the client sent a notification to jicofo that its ICE session failed, and jicofo then sent a re-invite (via a transport-replace). The new ICE session failed, because it was on the same bridge where packets are filtered. Audio and video continued to work because they are P2P.
As Emil mentioned, the preferred way forward is to use jitsi-videobridge with WebSockets. In this case you would see independent retries for the ICE session (if it fails) and the WebSocket to jvb (if it fails).
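For context, switching the bridge channel to WebSockets is a client deployment setting; a minimal sketch, assuming the openBridgeChannel option that jitsi-meet's config.js exposes (the exact name and accepted values may vary across versions):

```js
// In the deployment's config.js (sketch, inside the config object):
// 'websocket' opens a Colibri WebSocket to the bridge, which retries on its
// own; 'datachannel' (or true) keeps the SCTP channel that is tied to ICE.
openBridgeChannel: 'websocket',
```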
I was looking for a document that describes how to set up jitsi-videobridge with WebSockets, but I didn't find one. I'll write something up and send a link later today.
In general even the SCTP data channels are supposed to come back after ICE restart, but probably something is broken.
I think after the ICE restart the bridge ICE session never succeeds because the filter is still in place, but audio/video continues to work because it's using the p2p peer connection. I was able to reproduce it.
@bkempe Here is the doc for setting up colibri websockets. If you follow it let me know whether it works or not, I might have forgotten some parts.
Thanks @bgrozev, this doc is very helpful and easy to follow.
One thing I noticed is that using port 8080 for org.jitsi.videobridge.rest.jetty.port might clash with the default value of org.jitsi.videobridge.rest.private.jetty.port.
This may also have been the problem in this ticket: https://github.com/jitsi/jitsi-videobridge/issues/657
Good point, I'll change that.
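For anyone else setting this up, here is a rough sketch of the bridge-side properties from that doc, with the public Jetty port moved off 8080 so it cannot collide with the private REST interface. These are the legacy sip-communicator.properties names as I recall them from the doc, so double-check against the current documentation:

```properties
# /etc/jitsi/videobridge/sip-communicator.properties (sketch)
# Enable Colibri WebSockets and advertise them under the deployment's domain.
org.jitsi.videobridge.rest.COLIBRI_WS_DISABLE=false
org.jitsi.videobridge.rest.COLIBRI_WS_DOMAIN=meet.example.com:443
org.jitsi.videobridge.rest.COLIBRI_WS_SERVER_ID=jvb1
org.jitsi.videobridge.rest.COLIBRI_WS_TLS=true
# Public Jetty port serving /colibri-ws/; 9090 avoids the private REST
# interface's default of 8080.
org.jitsi.videobridge.rest.jetty.port=9090
```

If I remember the doc correctly, the front-end web server then proxies wss://<domain>/colibri-ws/<server-id>/ through to this port so clients reach it on 443.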
Hello, we are able to reproduce this issue of the BridgeChannel staying closed and not being able to reopen. In our case we have our own deployment on Kubernetes, with multiple JVBs (using a StatefulSet like here: https://github.com/hpi-schul-cloud/jitsi-deployment/blob/master/base/jitsi-shard/jvb/jvb-statefulset.yaml).
When one of the JVB pods dies, we can see that clients reconnect automatically to one of the other available JVBs. By debugging the JS we can see that this line is executed: https://github.com/jitsi/lib-jitsi-meet/blob/master/modules/xmpp/strophe.jingle.js#L205, and the BridgeChannel is initialized again, but it immediately fails with the error undefined:
Logger.js:154 2020-08-25T23:05:18.284Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.e.onerror>: Channel error: undefined
Logger.js:154 2020-08-25T23:05:18.285Z [modules/RTC/BridgeChannel.js] <RTCDataChannel.e.onclose>: Channel closed by server
Logger.js:154 2020-08-25T23:05:18.573Z [modules/RTC/BridgeChannel.js] <l._send>: Bridge Channel send: no opened channel.
Logger.js:154 2020-08-25T23:05:29.001Z [modules/RTC/BridgeChannel.js] <l._send>: Bridge Channel send: no opened channel.
It is worth noting that the candidates in the initial SDP logging look like this:
2020-08-25T23:00:40.015Z [modules/RTC/TraceablePeerConnection.js] <A.trace>: getRemoteDescription::preTransform type: offer
...
a=candidate:1 1 udp 2130706431 <local_IP> 32101 typ host generation 0
a=candidate:2 1 udp 1694498815 <public_JVB_IP> 32101 typ srflx raddr <local_IP> rport 32101 generation 0
But after connecting to the failover JVB, we get these listed:
a=candidate:1 1 udp 2130706431 <local_IP> 32100 typ host generation 0
a=candidate:2 1 udp 1694498815 <public_JVB_IP> 32100 typ srflx raddr <local_IP> rport 32100 generation 0
a=candidate:1 1 udp 2130706431 <local_IP> 32101 typ host generation 0
a=candidate:2 1 udp 1694498815 <public_JVB_IP> 32101 typ srflx raddr <local_IP> rport 32101 generation 0
a=candidate:1 1 udp 2130706431 <local_IP> 32101 typ host generation 0
a=candidate:2 1 udp 1694498815 <public_JVB_IP> 32101 typ srflx raddr <local_IP> rport 32101 generation 0
This makes us think that the old candidate on port 32101 is still present, and that might be why opening the data channel fails. At this point clients can see each other's video, and by inspecting with Wireshark we can see that they are connected to the bridge via UDP on port 32100.
We tried to manually initialize the channel again with this call:
conference.rtc.initializeBridgeChannel(conference.getActivePeerConnection(), null);
And it fails with the same error. We don't have any event that we can listen to in order to know when the data channel was closed, which makes it harder for us to detect when this happens in our application, resulting in clients that cannot exchange data messages with the bridge.
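Lacking a dedicated event, here is a sketch of the kind of watchdog we considered. It relies on lib-jitsi-meet internals (conference.rtc._channel and BridgeChannel.isOpen()) that are not public API and may change, so treat those names as assumptions; it also only helps with detection, since in our case the re-initialization call itself still fails:

```js
// Hypothetical watchdog (not lib-jitsi-meet public API): periodically check
// whether the bridge channel is open and try to re-initialize it if not.
// `conference.rtc._channel` and `isOpen()` are internal names and may change.
const CHECK_INTERVAL_MS = 5000;

setInterval(() => {
    const channel = conference.rtc._channel;

    if (!channel || !channel.isOpen()) {
        console.warn('Bridge channel closed, attempting to re-initialize it');
        conference.rtc.initializeBridgeChannel(
            conference.getActivePeerConnection(), null);
    }
}, CHECK_INTERVAL_MS);
```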
Could this also be related to this observation of Firefox (but not Chrome) crashing? https://community.jitsi.org/t/firefox-and-safari-endless-loop-with-tounifiedplan/99940
I have the same issue. It looks like we do not have any event to catch this type of error at the JavaScript level; the error is already thrown on the native side before it can propagate up to the JavaScript side.
Description
We're testing the behavior of a JVB connection interruption (no XMPP interruption) and its effect on the Jitsi Meet frontend.
Current behavior
A JVB connection interruption of a few seconds leads to the following JS console output, and the BridgeChannel stays closed / does not become available anymore:
Expected Behavior
The BridgeChannel reconnects.
Possible Solution
There is retry logic in BridgeChannel.js, but only for the WebSocket. Why is that?
Steps to reproduce
Disable the JVB connection via
on the JVB instance, then open the port again after some time via
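The exact commands are not preserved above; purely as an illustration, one common way to simulate this kind of interruption is to drop the bridge's media traffic with iptables (assuming the default single UDP media port 10000) and then remove the rule again:

```sh
# Hypothetical sketch only; the original reproduction commands may differ.
# Block incoming UDP media traffic to the bridge for a while...
iptables -I INPUT -p udp --dport 10000 -j DROP
# ...then restore it after some time.
iptables -D INPUT -p udp --dport 10000 -j DROP
```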
Environment details
Latest Jitsi