jitsi / jitsi-meet

Jitsi Meet - Secure, Simple and Scalable Video Conferences that you use as a standalone app or embed in your web application.
https://jitsi.org/meet
Apache License 2.0
23.06k stars 6.7k forks source link

Can not invite participant -- no bridge available. #2172

Closed ghost closed 6 years ago

ghost commented 6 years ago

Hi!

This is a long running jitsi installation (0.5 year) that after some time started failing.

After some time jitsi-meet is running, when someone tries to do a conference both browser clients crash.

jvb.log says:

JVB 2017-11-14 10:10:09.829 FINE: [501] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId 83a76-51612): <iq id="83a76-51612" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:09.829 FINE: [501] org.jitsi.videobridge.xmpp.ComponentImpl.log() RECV: <iq id="83a76-51612" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:11.138 FINE: [503] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId AkQL6-236189): <iq type="get" to="jitsi-videobridge.meet.guifi.net" from="focus@auth.meet.guifi.net/focus1071006487923453" id="AkQL6-236189"><query xmlns="http://jabber.org/protocol/disco#info"/></iq>
JVB 2017-11-14 10:10:11.138 FINE: [503] org.jitsi.videobridge.xmpp.ComponentImpl.processIQRequest() (serving component 'JitsiVideobridge') Processing IQ request (packetId AkQL6-236189).
JVB 2017-11-14 10:10:11.138 FINE: [503] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Responding to IQ (packetId AkQL6-236189) with: <iq type="result" id="AkQL6-236189" from="jitsi-videobridge.meet.guifi.net" to="focus@auth.meet.guifi.net/focus1071006487923453"><query xmlns="http://jabber.org/protocol/disco#info"><identity category="component" type="conference" name="JitsiVideobridge"/><feature var="http://jabber.org/protocol/disco#info"/><feature var="urn:xmpp:ping"/><feature var="jabber:iq:last"/><feature var="urn:xmpp:time"/><feature var="http://jitsi.org/protocol/colibri"/><feature var="http://jitsi.org/protocol/healthcheck"/><feature var="urn:xmpp:jingle:apps:dtls:0"/><feature var="urn:xmpp:jingle:transports:ice-udp:1"/><feature var="urn:xmpp:jingle:transports:raw-udp:1"/><feature var="jabber:iq:version"/></query></iq>
JVB 2017-11-14 10:10:19.830 FINE: [36] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId 83a76-51613): <iq id="83a76-51613" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:19.830 FINE: [36] org.jitsi.videobridge.xmpp.ComponentImpl.log() RECV: <iq id="83a76-51613" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:29.830 FINE: [26] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId 83a76-51614): <iq id="83a76-51614" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:29.830 FINE: [26] org.jitsi.videobridge.xmpp.ComponentImpl.log() RECV: <iq id="83a76-51614" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:39.830 FINE: [613] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId 83a76-51615): <iq id="83a76-51615" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:39.830 FINE: [613] org.jitsi.videobridge.xmpp.ComponentImpl.log() RECV: <iq id="83a76-51615" type="result" to="jitsi-videobridge.meet.guifi.net" from="meet.guifi.net"/>
JVB 2017-11-14 10:10:41.138 FINE: [30] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId AkQL6-236201): <iq type="get" to="jitsi-videobridge.meet.guifi.net" from="focus@auth.meet.guifi.net/focus1071006487923453" id="AkQL6-236201"><query xmlns="http://jabber.org/protocol/disco#info"/></iq>
JVB 2017-11-14 10:10:41.138 FINE: [30] org.jitsi.videobridge.xmpp.ComponentImpl.processIQRequest() (serving component 'JitsiVideobridge') Processing IQ request (packetId AkQL6-236201).

meanwhile jvb.log is FINE:

Jicofo 2017-11-14 09:56:40.598 INFO: [86] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Lip-sync enabled in room@conference.meet.guifi.net
Jicofo 2017-11-14 09:56:40.598 INFO: [86] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Joining the room: room@conference.meet.guifi.net
Jicofo 2017-11-14 09:56:40.599 INFO: [55] org.jitsi.jicofo.ChatRoomRoleAndPresence.log() Chat room event ChatRoomMemberPresenceChangeEvent[type=MemberJoined sourceRoom=org.jitsi.impl.protocol.xmpp.ChatRoomImpl@1fced781 member=ChatMember[room@conference.meet.guifi.net/focus, jid: null]@283587152]
Jicofo 2017-11-14 09:56:40.599 WARNING: [55] org.jitsi.jicofo.ChatRoomRoleAndPresence.log() Focus role unknown
Jicofo 2017-11-14 09:56:40.599 INFO: [55] org.jitsi.jicofo.ChatRoomRoleAndPresence.log() Obtained focus role: OWNER
Jicofo 2017-11-14 09:56:40.599 INFO: [55] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Member room@conference.meet.guifi.net/focus joined.
Jicofo 2017-11-14 09:56:40.600 INFO: [86] org.jitsi.jicofo.JitsiMeetRecording.log() No recorder service discovered.
Jicofo 2017-11-14 09:56:40.719 INFO: [55] org.jitsi.jicofo.ChatRoomRoleAndPresence.log() Chat room event ChatRoomMemberPresenceChangeEvent[type=MemberJoined sourceRoom=org.jitsi.impl.protocol.xmpp.ChatRoomImpl@1fced781 member=ChatMember[room@conference.meet.guifi.net/8b338772, jid: null]@1031193953]
Jicofo 2017-11-14 09:56:40.720 INFO: [55] org.jitsi.jicofo.ChatRoomRoleAndPresence.log() Granted owner to room@conference.meet.guifi.net/8b338772
Jicofo 2017-11-14 09:56:40.720 INFO: [55] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Member room@conference.meet.guifi.net/8b338772 joined.
Jicofo 2017-11-14 10:10:36.131 INFO: [63] org.jitsi.jicofo.xmpp.FocusComponent.handleConferenceIq().396 Focus request for room: room@conference.meet.guifi.net
Jicofo 2017-11-14 10:10:36.172 INFO: [55] org.jitsi.jicofo.ChatRoomRoleAndPresence.log() Chat room event ChatRoomMemberPresenceChangeEvent[type=MemberJoined sourceRoom=org.jitsi.impl.protocol.xmpp.ChatRoomImpl@1fced781 member=ChatMember[room@conference.meet.guifi.net/781016a4, jid: null]@1264931942]
Jicofo 2017-11-14 10:10:36.172 INFO: [55] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Member room@conference.meet.guifi.net/781016a4 joined.
Jicofo 2017-11-14 10:10:36.173 SEVERE: [55] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Can not invite participant -- no bridge available.
Jicofo 2017-11-14 10:10:36.173 SEVERE: [55] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Can not invite participant -- no bridge available.
Jicofo 2017-11-14 10:10:36.460 INFO: [55] org.jitsi.jicofo.ChatRoomRoleAndPresence.log() Chat room event ChatRoomMemberPresenceChangeEvent[type=MemberLeft sourceRoom=org.jitsi.impl.protocol.xmpp.ChatRoomImpl@1fced781 member=ChatMember[room@conference.meet.guifi.net/781016a4, jid: null]@1264931942]
Jicofo 2017-11-14 10:10:36.460 INFO: [55] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Member room@conference.meet.guifi.net/781016a4 is leaving
Jicofo 2017-11-14 10:10:36.460 INFO: [55] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Removed participant: true, room@conference.meet.guifi.net/781016a4

You can try yourself at: meet.guifi.net

Its nearly vanilla configuration (unchecked third party software like random gravatars). But inside this machine, there is a matrix.org and rocketchat (docker). Perhaps all these applications are fighting with the ports?

This is a stable jitsi installation:

# apt-cache show jitsi
Package: jitsi
Architecture: amd64
Version: 2.10.5550-1

Thanks! Pedro

damencho commented 6 years ago

For some reason jicofo thinks there is no jvb instance. It is stopped or something went wrong. Uploading the whole log files may reveal more. Restarting jvb will fix it in short term. The version you had pointed is the jitsi-desktop client and has nothing to do with jitsi-meet.

ghost commented 6 years ago

For some reason jicofo thinks there is no jvb instance. It is stopped or something went wrong.

# service jitsi-videobridge status
● jitsi-videobridge.service - LSB: Jitsi Videobridge
   Loaded: loaded (/etc/init.d/jitsi-videobridge; generated; vendor preset: enabled)
   Active: active (running) since Wed 2017-11-08 10:47:48 CET; 6 days ago
     Docs: man:systemd-sysv-generator(8)
    Tasks: 86 (limit: 4915)
   Memory: 243.9M
      CPU: 43min 48.395s
   CGroup: /system.slice/jitsi-videobridge.service
           └─8372 java -Xmx3072m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Djava.library.path=/usr/share/jitsi-videobridge/lib/native/linux-64 -Dnet.java.s

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

Uploading the whole log files may reveal more.

Should we proceed this way? -> https://github.com/jitsi/jitsi-meet/issues/1390#issuecomment-290500479

Restarting jvb will fix it in short term.

Exactly... but it fails after some time running

The version you had pointed is the jitsi-desktop client and has nothing to do with jitsi-meet.

Ouch!

# dpkg -l | grep jitsi
ii  jitsi-meet                      1.0.2467-1                     all          WebRTC JavaScript video conferences
ii  jitsi-meet-prosody              1.0.2301-1                     all          Prosody configuration for Jitsi Meet
ii  jitsi-meet-web                  1.0.2301-1                     all          WebRTC JavaScript video conferences
ii  jitsi-meet-web-config           1.0.2301-1                     all          Configuration for web serving of Jitsi Meet
ii  jitsi-videobridge               995-1                          amd64        WebRTC compatible Selective Forwarding Unit (SFU)
# dpkg -l | grep jicofo
ii  jicofo                          1.0-371-1                      amd64        JItsi Meet COnference FOcus
damencho commented 6 years ago

As a start, just upload all jvb and jicofo logs from /var/log/jitsi

damencho commented 6 years ago

I see this in the logs. Jicofo 2017-11-13 20:20:47.580 WARNING: [38] org.jitsi.jicofo.JvbDoctor.log() Health check failed on: jitsi-videobridge.meet.guifi.net error: java.net.SocketException: Too many open files (Error creating socket)

We haven't seen this for months now, that is strange. So my proposal is to restart jvb and jicofo and when you see the problem can you collect the logs as described in the #1390 (comment) and send it to me so I can take a look.

Thanks for the report and the cooperation.

ghost commented 6 years ago

Hi again,

right now I had to resize the virtual disk for this virtual machine (after some little security upgrades): no space left said (~40 MB left). Is this error an early alert that jitsi-meet is not having enough space to continue working?

It had 4 GB (yes, I know is very tiny), now it has 8 GB. We will see

damencho commented 6 years ago

No this is not free disk space. There is a limitation of the number of open file handles by a user/process that can be used. And it seems it reaches this limit. There is a ulimit command to make it higher. But also reaching that it means there can be a bug, at least the previous time we found one and fixed it and we haven't seen reports for that recently, till now.

olivierb2 commented 6 years ago

+1

damencho commented 6 years ago

@olivierb2 Can you give more details, are you seeing the same error in jicofo logs: Health check failed on: jitsi-videobridge.meet.guifi.net error: java.net.SocketException: Too many open files (Error creating socket) If you experience that can you try doing the heap dump as explained

olivierb2 commented 6 years ago

Hi @damencho Apologies for the +1, I just realize later that I was using the unstable build, I redone the installation from the stable, and yet it seems much more better. I come back to you if I discover any issue.

paulvt commented 6 years ago

I have encountered the problem too. The logs (both jicofo and jvb) report nothing out of the ordinary, i.e. not on the INFO/FINE log level, except for:

org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Can not invite participant -- no bridge available

I can see that both jicofo and jvb are running.

I had to reboot the machine (it's a production instance) and that solved the issue for now. I will report back if I find out more.

saghul commented 6 years ago

OK, I'm going to close this then. Please let us know if you still face issues.

TheReal1604 commented 6 years ago

@saghul @damencho i am facing the exactly same error as @paulvt on a very rare used instance. Usually we use this instance all 2-3~ Weeks for one meeting. Everytime I start an meeting after this spare time the exact same message is logged:

org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Can not invite participant -- no bridge available

This can only be resolved if I restart jicofo. After that it seems to be okay.

damencho commented 6 years ago

@TheReal1604 it will be helpful when that happens, to check all the logs or send it to me privately if you want and let me check it. All files in /var/log/jitsi. In the jicofo log there should be a failed healthcheck against jvb and it should have some message about why that failed. And also at the same time what were the logs in jvb.

TheReal1604 commented 6 years ago

@damencho send you an email with a link to my logs. We will report here soon.. :-)

ZsZs73 commented 6 years ago

Hi, Apparently I have the same problem or very similar. My jvb log also reports "Too many open files". Below there are two health checks. The first one without error and the second one with error.

JVB 2018-04-26 01:24:56.126 FINE: [24] org.jitsi.videobridge.xmpp.ComponentImpl.processIQRequest() (serving component 'JitsiVideobridge') Processing IQ request (packetId Sln4t-92193).
JVB 2018-04-26 01:24:56.126 FINE: [24] org.jitsi.videobridge.xmpp.ComponentImpl.log() RECV: <iq type="get" to="jitsi-videobridge.jitsi.hq.somecorp.com" from="focus@auth.jitsi.hq.somecorp.com/focus347271548652922" id="Sln4t-92193"><healthcheck xmlns="http://jitsi.org/protocol/healthcheck"/></iq>
JVB 2018-04-26 01:24:56.126 INFO: [24] org.jitsi.videobridge.Videobridge.log() CAT=stat create_conf,conf_id=9ecc0c877669b296 conf_name=null,logging=false,conf_count=1,ch_count=0,v_streams=0
JVB 2018-04-26 01:24:56.130 INFO: [24] org.jitsi.impl.neomedia.VideoMediaStreamImpl.log() Creating a BandwidthEstimator for stream org.jitsi.impl.neomedia.VideoMediaStreamImpl@55bacc0a
JVB 2018-04-26 01:24:56.134 INFO: [24] org.jitsi.impl.neomedia.VideoMediaStreamImpl.log() Creating a BandwidthEstimator for stream org.jitsi.impl.neomedia.VideoMediaStreamImpl@51401e8b
JVB 2018-04-26 01:24:56.145 FINE: [24] org.jitsi.videobridge.xmpp.ComponentImpl.log() SENT: <iq to="focus@auth.jitsi.hq.somecorp.com/focus347271548652922" from="jitsi-videobridge.jitsi.hq.somecorp.com" id="Sln4t-92193" type="result"/>
JVB 2018-04-26 01:24:56.145 FINE: [24] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Responding to IQ (packetId Sln4t-92193) with: <iq to="focus@auth.jitsi.hq.somecorp.com/focus347271548652922" from="jitsi-videobridge.jitsi.hq.somecorp.com" id="Sln4t-92193" type="result"/>

JVB 2018-04-26 01:24:58.281 FINE: [34] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId puSV8-181423): <iq id="puSV8-181423" type="result" to="jitsi-videobridge.jitsi.hq.somecorp.com" from="jitsi.hq.somecorp.com"/>
JVB 2018-04-26 01:24:58.281 FINE: [34] org.jitsi.videobridge.xmpp.ComponentImpl.log() RECV: <iq id="puSV8-181423" type="result" to="jitsi-videobridge.jitsi.hq.somecorp.com" from="jitsi.hq.somecorp.com"/>

JVB 2018-04-26 01:25:06.126 FINE: [184] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Processing IQ (packetId Sln4t-92197): <iq type="get" to="jitsi-videobridge.jitsi.hq.somecorp.com" from="focus@auth.jitsi.hq.somecorp.com/focus347271548652922" id="Sln4t-92197"><healthcheck xmlns="http://jitsi.org/protocol/healthcheck"/></iq>
JVB 2018-04-26 01:25:06.126 FINE: [184] org.jitsi.videobridge.xmpp.ComponentImpl.processIQRequest() (serving component 'JitsiVideobridge') Processing IQ request (packetId Sln4t-92197).
JVB 2018-04-26 01:25:06.131 FINE: [184] org.jitsi.videobridge.xmpp.ComponentImpl.log() RECV: <iq type="get" to="jitsi-videobridge.jitsi.hq.somecorp.com" from="focus@auth.jitsi.hq.somecorp.com/focus347271548652922" id="Sln4t-92197"><healthcheck xmlns="http://jitsi.org/protocol/healthcheck"/></iq>
JVB 2018-04-26 01:25:06.132 INFO: [184] org.jitsi.videobridge.Videobridge.log() CAT=stat create_conf,conf_id=55a3018dcfc90a68 conf_name=null,logging=false,conf_count=1,ch_count=0,v_streams=0
JVB 2018-04-26 01:25:06.135 INFO: [184] org.jitsi.impl.neomedia.VideoMediaStreamImpl.log() Creating a BandwidthEstimator for stream org.jitsi.impl.neomedia.VideoMediaStreamImpl@3c9f1f0f
JVB 2018-04-26 01:25:06.141 INFO: [184] org.jitsi.impl.neomedia.VideoMediaStreamImpl.log() Creating a BandwidthEstimator for stream org.jitsi.impl.neomedia.VideoMediaStreamImpl@f4ddaa2
JVB 2018-04-26 01:25:06.150 FINE: [184] org.jitsi.videobridge.xmpp.ComponentImpl.log() SENT: <iq to="focus@auth.jitsi.hq.somecorp.com/focus347271548652922" from="jitsi-videobridge.jitsi.hq.somecorp.com" id="Sln4t-92197" type="error"><error type="cancel"><internal-server-error xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"></internal-server-error><text xmlns="urn:ietf:params:xml:ns:xmpp-stanzas" xml:lang="en">java.net.SocketException: Too many open files (Error creating socket)</text></error></iq>
JVB 2018-04-26 01:25:06.150 FINE: [184] org.jitsi.videobridge.xmpp.ComponentImpl.processIQ() (serving component 'JitsiVideobridge') Responding to IQ (packetId Sln4t-92197) with: <iq to="focus@auth.jitsi.hq.somecorp.com/focus347271548652922" from="jitsi-videobridge.jitsi.hq.somecorp.com" id="Sln4t-92197" type="error"><error type="cancel"><internal-server-error xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"></internal-server-error><text xmlns="urn:ietf:params:xml:ns:xmpp-stanzas" xml:lang="en">java.net.SocketException: Too many open files (Error creating socket)</text></error></iq>

There are tons of open UDP sockets:

lsof|grep jvb| grep "protocol: UDP" |wc -l
317305

After restarting jitsi-videobridge service everything is back to normal.

edit: Ooops, forgot to mention that it is the latest available version of jitsi on Debian 9:

dpkg -l|grep jitsi
ii  jitsi-meet             1.0.2794-1
ii  jitsi-meet-prosody     1.0.2579-1
ii  jitsi-meet-web         1.0.2579-1
ii  jitsi-meet-web-config  1.0.2579-1
ii  jitsi-videobridge      1030-1
damencho commented 6 years ago

@ZsZs73

After restarting jvb did the UDP sockets were cleared?

To be able to debug the problem it will be very useful to get a heapdump of jvb. So when you see the problem can you execute as root: cd /tmp (it should be a folder where user jvb can write) /usr/share/jitsi-videobridge/collect-dump-logs.sh Then you can upload it somewhere and send me the link privately damencho AT jitsi.org. And if you can also send me there the output of ifconfig. Thanks.

ZsZs73 commented 6 years ago

@damencho Yes, the number of open UDP sockets went back to 0 after restarting jvb, but 5.5 hours later it is already:

lsof|grep jvb| grep "protocol: UDP" |wc -l
48108

This is just a test system, so there might be just a few VCs since jvb restart. I'll collect the dump/logs once it failing again.

update: ipv6 is disabled

damencho commented 6 years ago

Well, if you see so much for 5 hours, you can collect the logs now and send it to me so I can take a look and will let you know do I need anything more. Thanks!

damencho commented 6 years ago

A quick solution to your problem will be to disable health checking from jicofo. Add org.jitsi.jicofo.HEALTH_CHECK_INTERVAL=-1 to /etc/jitsi/jicofo/sip-communicator.properties and restart jicofo.

ghost commented 6 years ago

Wow @ZsZs73 ! Great contribution detecting this bug

In my server in some seconds we can see crazy UDP usage (theoretically now the server is not being used). I'm going to send you now the report privately @damencho

[10:13:43] root@host:~# lsof|grep jvb| grep "protocol: UDP" |wc -l
66649
[10:13:46] root@host:~# lsof|grep jvb| grep "protocol: UDP" |wc -l
49129
[10:14:09] root@host:~# lsof|grep jvb| grep "protocol: UDP" |wc -l
50297
[10:14:15] root@host:~# lsof|grep jvb| grep "protocol: UDP" |wc -l
50297
[10:14:19] root@host:~# lsof|grep jvb| grep "protocol: UDP" |wc -l
50297
[10:14:23] root@host:~# lsof|grep jvb| grep "protocol: UDP" |wc -l
51465
[10:14:27] root@host:~# lsof|grep jvb| grep "protocol: UDP" |wc -l
51465
damencho commented 6 years ago

I will check you dumb, but the immediate solution is disabling the health checks from jicofo. I spent yesterday looking at this, and I was able to reproduce it. There is no memory leak or any problem in the code. All the objects are unreachable in the memory it is that it takes time for GC to collect them and so clean the file descriptors used. But why it takes so much time for the garbage collector to collect them, I have no idea. If you try and force GC with a command # sudo -u jvb jcmd GC.run you will see that that number becomes almost 0.

ZsZs73 commented 6 years ago

Hi @damencho amencho, The forced garbage collector run takes less than a second and I conform that it clears the UDP connections, but it is not enough to make jicofo happy. It continues to complain about org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Can not invite participant -- no bridge available. Only restarting jicofo made it operational again. The proposed workaround (disabling health check) seems to work, at least the number of UDP connections do not increase quickly.

damencho commented 6 years ago

You can disable those health checks: org.jitsi.jicofo.HEALTH_CHECK_INTERVAL=-1 adding it to jicofo config and you will not have that problem.

ZsZs73 commented 6 years ago

I am trying an another approach right now. I've just created following cron job:

cat >/etc/cron.d/jvbGC <<'EOF'
# Force runing Garbage Collector
# See https://github.com/jitsi/jitsi-meet/issues/2172
7 * * * * jvb /usr/bin/jcmd $(ps auxw|grep org.jitsi.videobridge.Main|grep -v grep| awk '{print $2}') GC.run
EOF

This runs the garbage collector hourly and keeps the open UDP connections low so far. I've re-enabled the health check in jicofo config. Let's see how it works.

ZsZs73 commented 6 years ago

@damencho, @pedro-nonfree The workaround described above works since more than a week. It is not going to heal the jvb if jicofo is already complaining. It should be applied together with restart of jvb and jicofo.

ghost commented 6 years ago

@damencho what are the (potential) consequences of holding healthcheck disabled?

damencho commented 6 years ago

Well, health checks are mostly for an environment with multiple jvb instances. When one fails we remove it from the pool of available jvb's and the infrastructure will take care to delete/restart or whatever to try bringing it up again, and in the meanwhile, no conferences will be sent to that jvb till it got healthy again. With one jvb there is no difference whether it is there and it is not working, or it is not there and not working :)

ZsZs73 commented 6 years ago

@damencho That is true, but apparently the scheduled/forced GC keeps jvb working with health check enabled at jicofo. I am far not familiar with the JVM structure. Is there a built-in GC running by default in the JVM? Does GC produce any logs?

damencho commented 6 years ago

Yep there is GC, but I have no idea why forcing helps, but regular not.

ZsZs73 commented 6 years ago

According to this: https://stackoverflow.com/questions/37576330/when-sockets-are-dereferenced-are-they-garbage-collected GC runs only if the available memory/resources are low. The link seems to contain valuable input about connection handling for those who speak Java. I am going to enable GC logging hoping that it will shed some light.

ZsZs73 commented 6 years ago

Just for the records: How to enable GC logging for jvb Add following line in /etc/jitsi/videobridge/config right after JVB_OPTS=""

JVB_EXTRA_JVM_PARAMS="-Xloggc:/var/log/jitsi/jvb-gc.log -XX:+PrintGCDetails \
-XX:+PrintGCDateStamps -XX:+PrintGCCause -XX:+UseGCLogFileRotation \
-XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=5M"

then restart jvb

damencho commented 6 years ago

You can try decreasing the memory used by jvb. I think it is 3gb by default. Maybe reducing it to 1 or 2 will help.

twoprops commented 6 years ago

I'm having a very similar problem. As soon as a second user attempts to connect, I get the "unfortunately, something went wrong" error message. There's nothing above the "FINE" level in jvb.log and the only errors in jicofo.log are a pair of "SEVERE: [37] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Can not invite participant -- no bridge available."

I have health checks disabled and, unlike this users above, this error will occur upon a connection attempt even immediately after restarting jvb and jicofo and, in fact, even immediately after a system reboot.

Is there any other place I might look to get any more information?

Thanks.

damencho commented 6 years ago

Check jvb logs after you restart it.

twoprops commented 6 years ago

There are many, many INFO and FINE messages. The only section that vaguely looks like an error or warning is

JVB 2018-08-12 21:59:23.427 INFO: [1] util.NetworkUtils.<clinit>().118 java.net.preferIPv4Stack=null
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/jitsi-videobridge/lib/slf4j-jdk14-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/jitsi-videobridge/lib/slf4j-simple-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
JVB 2018-08-12 21:59:23.666 INFO: [11] org.jitsi.service.libjitsi.LibJitsi.log() Successfully started LibJitsi using as implemen
tation: org.jitsi.impl.libjitsi.LibJitsiOSGiImpl

Everything else looks like just reporting on options being set and connections being made (bearing in mind that I have a tremendous level of ignorance of the internals of jitsi). The "multiple_bindings" explanation doesn't make sense to me owing to my lack of context.

--Ron

twoprops commented 6 years ago

Upon attempting a connection, I also captured these jvb log items that I'd missed before:

JVB 2018-08-12 22:31:53.109 SEVERE: [121] org.jitsi.sctp4j.Sctp.log() Init'ing brian's patched usrsctp
=====>: org_jitsi_sctp4j_Sctp.c calling init
=====>: org_jitsi_sctp4j_Sctp.c about to set SCTP_DEBUG_ALL
=====>: org_jitsi_sctp4j_Sctp.c setting SCTP_DEBUG_ALL
JVB 2018-08-12 22:31:53.258 WARNING: [100] org.jitsi.impl.neomedia.transform.dtls.DatagramTransportImpl.log() Unknown DTLS handshake message type: -101
JVB 2018-08-12 22:31:53.260 WARNING: [100] org.jitsi.impl.neomedia.transform.dtls.DatagramTransportImpl.log() Unknown DTLS handshake message type: 86
JVB 2018-08-12 22:31:53.289 INFO: [100] org.jitsi.impl.neomedia.transform.srtp.OpenSSLWrapperLoader.log() jnopenssl successfully loaded
JVB 2018-08-12 22:31:53.313 INFO: [76] org.jitsi.videobridge.Conference.log() CAT=stat ds_change,conf_id=a5c963dab54459cb ds_id=57f7ea89
JVB 2018-08-12 22:31:53.313 WARNING: [76] org.jitsi.videobridge.EndpointMessageTransport.log() SCTP connection with 57f7ea89 not ready yet.
JVB 2018-08-12 22:31:53.313 WARNING: [76] org.jitsi.videobridge.EndpointMessageTransport.log() No available transport channel, can't send a message
JVB 2018-08-12 22:31:53.313 WARNING: [76] org.jitsi.videobridge.EndpointMessageTransport.log() SCTP connection with ddf879ff not ready yet.
JVB 2018-08-12 22:31:53.313 WARNING: [76] org.jitsi.videobridge.EndpointMessageTransport.log() No available transport channel, can't send a message

There were then several other places where the "WARNING...Unknown DTLS handshake message ... No available transport" messages repeat.

Sorry that I didn't see those before.

--r

damencho commented 6 years ago

@twoprops I don't understand. You say you see this in jicofo logs "Can not invite participant -- no bridge available", this means that no jvb is available if this is the case no conference will use jvb so logs as the one from the above should no appear. Do you no longer see that log in jicofo "no bridge available"?

twoprops commented 6 years ago

You are correct; there were no "can not invite" errors in the jicofo logs this time (nor had I missed the JVB errors before; they were not there). To make matters even more confusing, it actually connected once (audio only, no video and video could not be turned on), then after a bit went back to the previous "something has gone wrong..." behavior.

I think before I waste any more of your time, I'll just nuke the entire server and re-install from scratch. Since this instance ran for many months without a problem, I suspect some deep system corruption.

Thanks for your attention and the work you have done on this. I will let you know if I continue to experience problems with a new install.

--r

adicatur commented 4 years ago

Hi @damencho @ZsZs73

Thanks for this discussion, I have a question, what's your suggestion if we want to have multiple JVB, besides we need to enable the health-check? Is it only reduce the JVB memory or and using cron to remove the UDP works so far, or do you have additional information?

Thanks

mcr commented 3 years ago

After many months of stable operation, my server just went into this. I do not see any issue with not enough file descriptors. I see:

HealthChecker.run#170: Performed a successful health check in PT0S. Sticky failure: false

but, fundamentally, when a second user joins a room, I see:

Jicofo 2021-02-06 21:17:12.953 WARNING: [29] org.jitsi.jicofo.bridge.BridgeSelectionStrategy.log() Failed to select initial bridge for participantRegion=null
Jicofo 2021-02-06 21:17:12.954 SEVERE: [29] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Can not invite participant -- no bridge available.
Jicofo 2021-02-06 21:17:12.954 SEVERE: [29] org.jitsi.jicofo.JitsiMeetConferenceImpl.log() Failed to select a bridge for Participant[prospectiveamateursstabcheaply@conference.meet.sandelman.ca/0bba1b0f]@254176509

I could nuke and install the server, but it seems that we are dealing with some kind of rot.

mcr commented 3 years ago

When I restart jicofo, I catch this:

Jicofo 2021-02-06 21:21:27.223 WARNING: [29] org.jitsi.impl.protocol.xmpp.ChatRoomImpl.log() Unable to handle packet: <presence to='focus@auth.meet.sandelman.ca/focus189430232088155' from='jibribrewery@internal.auth.meet.sandelman.ca/focus' id='22aN9-23' type='error'><error type='wait'><internal-server-error xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/></error></presence>
Jicofo 2021-02-06 21:21:27.225 SEVERE: [17] org.jitsi.jicofo.xmpp.BaseBrewery.log() Failed to create room: jvbbrewery@internal.auth.meet.sandelman.ca
net.java.sip.communicator.service.protocol.OperationFailedException: Failed to join the room

maybe I should open a new issue.

damencho commented 3 years ago

Which are the versions you use?

mcr commented 3 years ago

Which are the versions you use?

gimli-[~] mcr 10001 %dpkg --list | grep jitsi
ii  jitsi-meet                           2.0.5390-3                        all          WebRTC JavaScript video conferences
ii  jitsi-meet-prosody                   1.0.4628-1                        all          Prosody configuration for Jitsi Meet
ii  jitsi-meet-turnserver                1.0.4628-1                        all          Configures coturn to be used with Jitsi Meet
ii  jitsi-meet-web                       1.0.4628-1                        all          WebRTC JavaScript video conferences
ii  jitsi-meet-web-config                1.0.4628-1                        all          Configuration for web serving of Jitsi Meet
ii  jitsi-videobridge2                   2.1-416-g2f43d1b4-1               all          WebRTC compatible Selective Forwarding Unit (SFU)
damencho commented 3 years ago

That is strange, maybe there is a prosody error you can check. Other than that maybe stopping the bridge and restarting jicofo and then starting the bridge will fix it, but I'm not sure what is the error.

damencho commented 3 years ago

Is your config in prosody like this: https://github.com/jitsi/jitsi-meet/blob/master/doc/debian/jitsi-meet-prosody/prosody.cfg.lua-jvb.example#L67

Findus23 commented 3 years ago

I had the exact same issue as @mcr (could not start call, "no bride available", "Failed to create room: jvbbrewery...")

I blamed it on the fact that I recently upgraded my debian server to bullseye and haven't tried to use Jitsi since, but maybe it is unrelated. I edited my /etc/prosody/conf.avail/call.example.com.cfg.lua and added those two lines, restarted jicofo.service and since then everything seems to work again. https://github.com/jitsi/jitsi-meet/blob/e525c2b2ec1abcab6b9233720ba4e1ead3d96a14/doc/debian/jitsi-meet-prosody/prosody.cfg.lua-jvb.example#L73-L74

EasyNetDev commented 7 months ago

Hi,

I'm facing the same behavior as soon another user is entering in the meeting. Each time jifoco says:

Jicofo 2024-02-26 16:43:37.540 INFO: [64] [room=testconf@conference.jitsi.MYDOMAIN.dev meeting_id=bf29d8e7-7cb5-4034-a9ed-6c44d4c66645] JitsiMeetConferenceImpl.inviteChatMember#800: Creating participant c670aa65 with features=[XMPP_CAPS, JSON_SOURCES, RECEIVE_MULTIPLE_STREAMS, REMB, SOURCE_NAMES, TCC, VISITORS_V1, E2EE, RTX, JINGLE, DTLS, JINGLE_RTP, AUDIO, VIDEO, SCTP, JINGLE_ICE, RAYO]
Jicofo 2024-02-26 16:43:37.540 INFO: [70] [room=testconf@conference.jitsi.MYDOMAIN.dev meeting_id=bf29d8e7-7cb5-4034-a9ed-6c44d4c66645] ColibriV2SessionManager.allocate#278: Allocating for 97861a20
Jicofo 2024-02-26 16:43:37.540 WARNING: [70] BridgeSelector.selectBridge#182: There are no operational bridges.
Jicofo 2024-02-26 16:43:37.540 SEVERE: [70] [room=testconf@conference.jitsi.MYDOMAIN.dev meeting_id=bf29d8e7-7cb5-4034-a9ed-6c44d4c66645 participant=97861a20] ParticipantInviteRunnable.doRun#218: Can not invite participant, no bridge available.
Jicofo 2024-02-26 16:43:37.541 INFO: [74] [room=testconf@conference.jitsi.MYDOMAIN.dev meeting_id=bf29d8e7-7cb5-4034-a9ed-6c44d4c66645] ColibriV2SessionManager.allocate#278: Allocating for c670aa65
Jicofo 2024-02-26 16:43:37.541 WARNING: [74] BridgeSelector.selectBridge#182: There are no operational bridges.
Jicofo 2024-02-26 16:43:37.543 SEVERE: [74] [room=testconf@conference.jitsi.MYDOMAIN.dev meeting_id=bf29d8e7-7cb5-4034-a9ed-6c44d4c66645 participant=c670aa65] ParticipantInviteRunnable.doRun#218: Can not invite participant, no bridge available.
Jicofo 2024-02-26 16:43:37.773 INFO: [72] [room=testconf@conference.jitsi.MYDOMAIN.dev meeting_id=bf29d8e7-7cb5-4034-a9ed-6c44d4c66645] JitsiMeetConferenceImpl.onMemberLeft#942: Member left:c670aa65
Jicofo 2024-02-26 16:43:37.773 INFO: [72] [room=testconf@conference.jitsi.MYDOMAIN.dev meeting_id=bf29d8e7-7cb5-4034-a9ed-6c44d4c66645] JitsiMeetConferenceImpl.terminateParticipant#1021: Terminating c670aa65, reason: gone, send session-terminate: false

Checking the number of files opened:

# lsof | grep jvb | wc -l
9538
# lsof | grep jvb | grep "UDP" |wc -l
102

# dpkg -l | grep jitsi
ii  jitsi-meet                    2.0.9258-1                     all          WebRTC JavaScript video conferences
ii  jitsi-meet-prosody            1.0.7790-1                     all          Prosody configuration for Jitsi Meet
ii  jitsi-meet-turnserver         1.0.7790-1                     all          Configures coturn to be used with Jitsi Meet
ii  jitsi-meet-web                1.0.7790-1                     all          WebRTC JavaScript video conferences
ii  jitsi-meet-web-config         1.0.7790-1                     all          Configuration for web serving of Jitsi Meet
ii  jitsi-videobridge2            2.3-74-ga015be96-1             all          WebRTC compatible Selective Forwarding Unit (SFU)
ii  lua-basexx                    0.4.1-jitsi1                   all          baseXX encoding/decoding library for Lua
ii  lua-cjson:amd64               2.1.0.10-jitsi1                amd64        JSON parser/encoder for Lua

I have no clue where to look for this issue.

damienvancouver commented 6 months ago

@EasyNetDev I also had this problem out of the blue, after a power outage of all things. Everything was working fine before reboot, but afterwards as soon as another participant would join the first user would be kicked with "BridgeSelector.selectBridge#182: There are no operational bridges."

Unsure of what could have changed, I reviewed the config files (those all were the same, as expected). Then I tried upgrading the Jitsi packages from the apt repository, this upgraded me to jitsti-meet 2.0.9364-1 (from my old 2.0.9528-1).

After that, everything was working again! So it was some kind of bug in 2.0.9528-1 that did not trigger until after the system was rebooted (weird?!). Try a package upgrade, it might be all that's needed to fix it.