igniterealtime / openfire-pade-plugin

A plugin for Openfire that offers web-based unified communications - chat, groupchat, telephone, audio and video conferencing.
Apache License 2.0

Reverse Websocket Proxy malfunction #446

Open gjaekel opened 1 year ago

gjaekel commented 1 year ago

The reverse websocket proxy running on the Openfire server, which mediates between the Jitsi web meeting instances in the participants' browsers and the Jitsi Videobridge (JVB), tends to drop backend connections that are still in use. In addition, the corresponding frontend connection isn't terminated. As a result, the web meeting client does not become aware of the failure and does not trigger its recovery mechanisms. Once a backend connection is broken, the client sends into the void and no longer receives messages.

The affected communication channel is used between all participant clients and the bridge to exchange information about available bandwidth and the video resolution in use. Without it, the client drops to the lowest video quality or stops video transmission altogether.

gjaekel commented 1 year ago

In general, the proxy works as needed. But very often, websocket connections between the user clients and the JVB become stuck (but are not closed on the browser side). My investigation shows that this is because proxied backend connections are dropped. On our production system, I constantly observe far fewer backend connections than the number of current meeting participants.

This seems to be caused by a general design problem with threading in the code, because the "wrong" connections are dropped at the moment that others are closed down. In addition, this failure doesn't lead to a shutdown of the frontend websocket connection: messages from the user clients are still accepted, and that side does not become aware of the problem. If the frontend connection at least died at the same moment, the client would probably detect this and recover the communication.
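
The behaviour asked for here, where a dead backend connection takes its frontend connection down with it, could be sketched roughly as follows. This is a minimal illustration in plain Java, not the plugin's actual code: `ProxiedPair` and `Leg` are hypothetical names, and `Leg` merely stands in for whatever session object the proxy holds for each side.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Pairs the two legs of one proxied websocket so that losing either closes both. */
final class ProxiedPair {

    /** Hypothetical stand-in for one websocket leg; not a Jetty or plugin type. */
    interface Leg {
        void close(int statusCode, String reason);
    }

    private final Leg frontend;
    private final Leg backend;
    // Guards against double teardown when both legs report a close concurrently.
    private final AtomicBoolean closed = new AtomicBoolean(false);

    ProxiedPair(Leg frontend, Leg backend) {
        this.frontend = frontend;
        this.backend = backend;
    }

    /** Intended to be called from the close and error callbacks of either leg. */
    void onLegClosed(int statusCode, String reason) {
        if (!closed.compareAndSet(false, true)) {
            return; // teardown already done by the other callback
        }
        try {
            backend.close(statusCode, reason);
        } finally {
            // Always close the browser side too, so the client can notice and reconnect.
            frontend.close(statusCode, reason);
        }
    }

    boolean isClosed() {
        return closed.get();
    }
}
```

With such a pairing, an error on the backend leg would call `onLegClosed(1011, "backend lost")` (1011 is the websocket "internal error" status code), and the browser would see its socket close instead of silently sending into the void.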

The typical related exception seems to be

20230102-142035.231 WARN  [oxyConnection-HttpClient-3681991] [o.e.j.u.t.s.EatWhatYouKill]
java.util.concurrent.RejectedExecutionException: CEP:SocketChannelEndPoint@5e98fcdc{l=/10.10.1.137:47392,r=/10.10.1.137:8180,OPEN,fill=FI,flush=-,to=1541/300000}{io=1/0,kio=1,kro=1}->WebSocketClientConnection@7f78da31[s=ConnectionState@4e4da285[OPENED],f=Flusher@2647dad5[IDLE][queueSize=0,aggregateSize=-1,terminate>
        at org.eclipse.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:693) ~[jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.execute(EatWhatYouKill.java:375) [jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) [jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) [jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) [jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:386) [jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) [jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) [jetty-util-9.4.43.v20210629.jar:9.4.43.v20210629]
        at java.lang.Thread.run(Thread.java:833) [?:?]
20230102-142035.232 DEBUG [Thread-16] [o.j.o.p.o.JitsiJvbWrapper] SEVERE: [123049] [confId=e2ca7e7f4f192f56 conf_name=haushaltsreferat@conference.appr.xmpp.dnb.de epId=c41abc46 stats_id=Terry-6OC] EndpointMessageTransport.webSocketError#382: Colibri websocket error: null
20230102-142052.392 DEBUG [Thread-16] [o.j.o.p.o.JitsiJvbWrapper] INFO: [123006] [confId=e2ca7e7f4f192f56 conf_name=haushaltsreferat@conference.appr.xmpp.dnb.de epId=c41abc46 stats_id=Terry-6OC] TlsServerImpl.notifyAlertReceived#245: close_notify received, connection closing

Maybe the code isn't thread-safe, has other race conditions, or doesn't use library calls in the right way or uses them with wrong or missing parameters.
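
For illustration, a `RejectedExecutionException` like the one in the trace is what a thread pool typically throws when it is asked to run a task after it has been stopped. The sketch below reproduces that with the JDK's own executor, not with Jetty's `QueuedThreadPool`; the scenario of one connection's teardown stopping a pool that other connections still use is an assumption about the cause, not a confirmed diagnosis, and `SharedPoolDemo` is a hypothetical name.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

final class SharedPoolDemo {

    /** Returns true if a task submitted after shutdown is rejected. */
    static boolean rejectedAfterShutdown() {
        // One pool shared by several proxied connections (assumed scenario).
        ExecutorService shared = Executors.newFixedThreadPool(2);

        // Connection A is closed and its teardown stops the shared pool.
        shared.shutdown();

        try {
            // Connection B is still open and needs the pool to handle a frame.
            shared.execute(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            // B's work is refused although B itself was never closed.
            return true;
        }
    }
}
```

If the proxy's per-connection clients share an executor in a similar way, giving each connection its own lifecycle, or stopping the shared pool only when the last connection is gone, would avoid this class of failure.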

deleolajide commented 1 year ago

Maybe the code isn't thread-safe, has other race conditions, or doesn't use library calls in the right way or uses them with wrong or missing parameters.

I totally agree with you. It does not help that we are using an old, end-of-life version of Jetty websockets in Openfire. I really want to move Jetty to the latest code as soon as possible because of HTTP/3.