Open gjaekel opened 3 years ago
The resource leak is proved to be caused by a bug inside the Colibiri Websocket Proxy: Disabling this component (and fall back to HTTP transmission of the Colibiri data channel) on a running system stops further fh consumption for new connection clients, i.e. the overall level of file handles don't rise anymore on a significant way.
Here some resource graphs for the Linux Container used for Openfire taken by a monitoring system. There are noticeable growings of the fh's at start of conference meetings (typical at full hour) where a lot of clients start to use the service this day. At about 12:30 there was a manual restart of OpenFire due to the near out-of-filehandle situation. At about 14:30 the use of the Colibiri-WS-Proxy was disabled.
As a comparison, here's a snapshot of the next day. The're was a full restart at the evening before and the WS Proxy was kept disabled all the time.
Investigations using VisualVM while having WS Proxy enabled had shown: There are a continouosly rising number of Threads named HttpClient@$ID-$TID
. The minority is showed alternating as parked and running state. This seems to be working right because they will disappear after a while, probably at disconnect of a client.
But the most are 100% at state Running, but seems to futile wait for input at a socket.
Example of thread task dump:
2021-01-07 14:04:28
"HttpClient@2b3c530e-16726" - Thread t@16726
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <3639f9f> (a sun.nio.ch.Util$3)
- locked <a11d1f5> (a java.util.Collections$UnmodifiableSet)
- locked <49bbacd9> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
at org.eclipse.jetty.io.ManagedSelector.nioSelect(ManagedSelector.java:149)
at org.eclipse.jetty.io.ManagedSelector.select(ManagedSelector.java:156)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:572)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:509)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
at org.eclipse.jetty.io.ManagedSelector$$Lambda$164/1911056207.run(Unknown Source)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
I guess that this caused by instances of the "innerside" http client connection of between the Colibri WS proxy and the Colibiri Endpoint at the JVB2.
Thanks for all the valuable data and debugging. I found that the proxy was not calling the stop method on the created websocket client object. See https://github.com/igniterealtime/openfire-pade-plugin/commit/6c4b9f44f2fc001c194957b000e61ea8b6660bbb
Please try latest jar file and confirm if it makes any difference.
Thank you. Will ASAP try to make an A/B-Test first by activating the debug logger ProxyConnection
because I'm not able to produce a serious load at weekend. But It should be also possible to find and monitor the threads mentioned abouve.
Now, the http clients are alt least closed at the moment a client leave a conference - a big step forward.
But each joining client still create 11 http client threads but seems to use just two of it. Below you'll see a part of the live thread panel from VisualVM for the JVM for JVB while two clients are joined at a conference:
Might there something around like a unused, "lost" and default-sized http-client pool?
Below is the output of the ProxyConnection
Logger for joining two participants and leaving afterwards. Payload output isn't included.
20210110-183218.641 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxySocket onConnect: WebSocketSession[websocket=JettyAnnotatedEventDriver[org.ifsoft.websockets.ProxyConnection$ProxySocket@6e5d50ce],behavior=CLIENT,connection=WebSocketClientConnection@6a62c9dd::SocketChannelEndPoint@2f605af5{l=/127.0.0.1:58192,r=localhost/127.0.0.1:8180,OPEN,fill=-,flush=-,to=0/300000}{io=0/0,kio=0,kro=1}->WebSocketClientConnection@6a62c9dd[s=ConnectionState@1a5a0a7e[OPENING],f=org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection$Flusher@1e85f2db[IDLE][queueSize=0,aggregateSize=-1,terminated=null],g=Generator[CLIENT,validating],p=Parser@6255fcf7[ExtensionStack,s=START,c=0,len=0,f=null]],remote=WebSocketRemoteEndpoint@4f4fa5ba[batching=true],incoming=JettyAnnotatedEventDriver[org.ifsoft.websockets.ProxyConnection$ProxySocket@6e5d50ce],outgoing=ExtensionStack[queueSize=0,extensions=[],incoming=org.eclipse.jetty.websocket.common.WebSocketSession,outgoing=org.eclipse.jetty.websocket.client.io.WebSocketClientConnection]]
20210110-183218.641 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxySocket onMessage
20210110-183218.641 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxyConnection - onMessage
20210110-183218.643 DEBUG [HttpClient@3d436cc5-399] [ProxyConnection] ProxySocket onMessage
20210110-183218.643 DEBUG [HttpClient@3d436cc5-399] [ProxyConnection] ProxyConnection - onMessage
20210110-183218.834 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxyConnection - deliver
20210110-183218.834 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxySocket deliver:
20210110-183218.835 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxyConnection - deliver
20210110-183218.835 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxySocket deliver:
20210110-183218.835 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxyConnection - deliver
20210110-183218.835 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxySocket deliver:
20210110-183218.835 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxyConnection - deliver
20210110-183218.835 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxySocket deliver:
20210110-183219.228 DEBUG [Jetty-QTP-BOSH-163] [ProxyConnection] ProxyConnection - deliver
20210110-183219.228 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] Connecting to : ws://localhost:8180/colibri-ws/ofmeet/3c2ba2a4d1889527/c42387eb?pwd=4r26knh08hg2715dv2brthl3uj
20210110-183219.228 DEBUG [Jetty-QTP-BOSH-163] [ProxyConnection] ProxySocket deliver:
20210110-183219.233 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxySocket onConnect: WebSocketSession[websocket=JettyAnnotatedEventDriver[org.ifsoft.websockets.ProxyConnection$ProxySocket@510260f7],behavior=CLIENT,connection=WebSocketClientConnection@553dee0f::SocketChannelEndPoint@103f6acc{l=/127.0.0.1:58194,r=localhost/127.0.0.1:8180,OPEN,fill=-,flush=-,to=1/300000}{io=0/0,kio=0,kro=1}->WebSocketClientConnection@553dee0f[s=ConnectionState@6322fbc9[OPENING],f=org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection$Flusher@21653a82[IDLE][queueSize=0,aggregateSize=-1,terminated=null],g=Generator[CLIENT,validating],p=Parser@1b92f537[ExtensionStack,s=START,c=0,len=0,f=null]],remote=WebSocketRemoteEndpoint@70109583[batching=true],incoming=JettyAnnotatedEventDriver[org.ifsoft.websockets.ProxyConnection$ProxySocket@510260f7],outgoing=ExtensionStack[queueSize=0,extensions=[],incoming=org.eclipse.jetty.websocket.common.WebSocketSession,outgoing=org.eclipse.jetty.websocket.client.io.WebSocketClientConnection]]
20210110-183219.233 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxySocket onMessage
20210110-183219.233 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxyConnection - onMessage
20210110-183219.236 DEBUG [HttpClient@50113581-415] [ProxyConnection] ProxySocket onMessage
20210110-183219.236 DEBUG [HttpClient@50113581-415] [ProxyConnection] ProxyConnection - onMessage
20210110-183219.238 DEBUG [HttpClient@50113581-415] [ProxyConnection] ProxySocket onMessage
20210110-183219.238 DEBUG [HttpClient@50113581-415] [ProxyConnection] ProxyConnection - onMessage
20210110-183219.270 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxyConnection - deliver
20210110-183219.270 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxySocket deliver:
20210110-183219.356 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxySocket onMessage
20210110-183219.357 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxySocket onMessage
20210110-183219.357 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxyConnection - onMessage
20210110-183219.357 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxyConnection - onMessage
20210110-183220.089 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxyConnection - deliver
20210110-183220.089 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxySocket deliver:
20210110-183220.096 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183220.096 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183220.099 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxyConnection - deliver
20210110-183220.099 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxySocket deliver:
20210110-183220.099 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183220.099 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183220.416 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxyConnection - deliver
20210110-183220.416 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxySocket deliver:
20210110-183220.419 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxySocket onMessage
20210110-183220.419 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxyConnection - onMessage
20210110-183220.433 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183220.433 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183220.435 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxySocket onMessage
20210110-183220.435 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxyConnection - onMessage
20210110-183220.643 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxyConnection - deliver
20210110-183220.643 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxySocket deliver:
20210110-183220.646 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxySocket onMessage
20210110-183220.646 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxyConnection - onMessage
20210110-183220.672 DEBUG [Jetty-QTP-BOSH-164] [ProxyConnection] ProxyConnection - deliver
20210110-183220.673 DEBUG [Jetty-QTP-BOSH-164] [ProxyConnection] ProxySocket deliver:
20210110-183228.025 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183228.025 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183228.028 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxySocket onMessage
20210110-183228.028 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxyConnection - onMessage
20210110-183228.049 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxyConnection - deliver
20210110-183228.049 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxySocket deliver:
20210110-183228.051 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxySocket onMessage
20210110-183228.052 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxyConnection - onMessage
20210110-183228.172 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183228.172 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183228.174 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxySocket onMessage
20210110-183228.174 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxyConnection - onMessage
20210110-183228.193 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxyConnection - deliver
20210110-183228.193 DEBUG [Jetty-QTP-BOSH-135] [ProxyConnection] ProxySocket deliver:
20210110-183228.196 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxySocket onMessage
20210110-183228.196 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxyConnection - onMessage
20210110-183238.042 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183238.042 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183238.045 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxySocket onMessage
20210110-183238.045 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxyConnection - onMessage
20210110-183238.065 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxyConnection - deliver
20210110-183238.066 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxySocket deliver:
20210110-183238.067 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxySocket onMessage
20210110-183238.068 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxyConnection - onMessage
20210110-183238.185 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183238.185 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183238.187 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxySocket onMessage
20210110-183238.188 DEBUG [HttpClient@3d436cc5-408] [ProxyConnection] ProxyConnection - onMessage
20210110-183238.204 DEBUG [Jetty-QTP-BOSH-167] [ProxyConnection] ProxyConnection - deliver
20210110-183238.204 DEBUG [Jetty-QTP-BOSH-167] [ProxyConnection] ProxySocket deliver:
20210110-183238.205 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxySocket onMessage
20210110-183238.205 DEBUG [HttpClient@50113581-426] [ProxyConnection] ProxyConnection - onMessage
20210110-183238.739 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - deliver
20210110-183238.739 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket deliver:
20210110-183238.743 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxySocket onMessage
20210110-183238.743 DEBUG [HttpClient@50113581-424] [ProxyConnection] ProxyConnection - onMessage
20210110-183239.222 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxyConnection - deliver
20210110-183239.222 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxySocket deliver:
20210110-183239.225 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxySocket onMessage
20210110-183239.225 DEBUG [HttpClient@3d436cc5-410] [ProxyConnection] ProxyConnection - onMessage
20210110-183243.946 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - disconnect
20210110-183243.946 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - sttop
20210110-183243.946 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxySocket onClose 1006 Disconnected
20210110-183243.946 DEBUG [Jetty-QTP-BOSH-162] [ProxyConnection] ProxyConnection - onClose Disconnected 1006
20210110-183244.516 DEBUG [Jetty-QTP-BOSH-167] [ProxyConnection] ProxyConnection - deliver
20210110-183244.516 DEBUG [Jetty-QTP-BOSH-167] [ProxyConnection] ProxySocket deliver:
20210110-183248.657 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxyConnection - deliver
20210110-183248.658 DEBUG [Jetty-QTP-BOSH-133] [ProxyConnection] ProxySocket deliver:
20210110-183249.256 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxyConnection - disconnect
20210110-183249.257 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxyConnection - sttop
20210110-183249.257 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxySocket onClose 1006 Disconnected
20210110-183249.257 DEBUG [Jetty-QTP-BOSH-158] [ProxyConnection] ProxyConnection - onClose Disconnected 1006
Maybe the Factory https://github.com/igniterealtime/openfire-pade-plugin/blob/6c4b9f44f2fc001c194957b000e61ea8b6660bbb/ofmeet/src/main/java/org/ifsoft/websockets/ProxyConnection.java#L54 should be declared as a static class singleton and used as a factory, too?
The JavaDoc states:
WebSocketClientFactory contains the common components needed by multiple WebSocketClient instances (for example, a ThreadPool, a NIO selector, etc).
WebSocketClients with different configurations should share the same factory to avoid to waste resources.
If a ThreadPool or MaskGen is passed in the constructor, then it is not added with AggregateLifeCycle.addBean(Object), so it's lifecycle must be controlled externally.
With help of a good friend, it seems that I was able to trace down the core reason.
In the core, one have to replace an automagical pool-sizing mechanism: Because we use a websocket connection over the lifetime of thread, we don't need more than one client.
- HttpClient httpClient = new HttpClient(...);
+ HttpClient httpClient = new HttpClient(new HttpClientTransportOverHTTP(1),...);
With this change, I observe one http-client thread per incoming WebProxy connection.
We're going to prepare a consolidated pull request. In extension to a fix of the core problem, we suggest to also remove deprecated methods and some refactoring like use a "singleton" SSL context factory.
We still have a -- much smaller -- leak of the triple in some, yet unknown cases.
With websocket transfer for Colibiri data channel, there's a serious file handle leak. In groups of three fh's, the total number of sockets is increasing all the time.
Invoking a forced FullGC via JMX don't help out, the resources will never be released during uptime of the OpenFire JVM.