isapir / lucee-websocket

Enables server WebSockets for Lucee via JSR-356 compliant servlet containers (e.g. Tomcat 8, Jetty 9.1, etc.)
GNU Lesser General Public License v2.1
17 stars 6 forks source link

Connection mgr throws exception when broadcasting to closed connections #11

Closed redtopia closed 6 years ago

redtopia commented 6 years ago

I'm running into an issue where the connection manager is throwing an exception when trying to broadcast to closed connections. After this happens, calling the broadcast() method of the connection manager will always generate an error until tomcat is restarted.

Here's a dump of the error: error-2017-08-25-11-14-29-2046699486860768.zip

Here's my modified chat app that I'm using for testing, which is throwing the exceptions (note: this error is not exclusive to this app): chat.zip

redtopia commented 6 years ago

Seems like the exception aborts broadcasting, so broadcasting basically breaks until the closed sockets are no longer seen by the broadcast() code.

I might be running into this problem because in my example code, onOpen() returns false when the user is not logged in. Returning false prevents the socket from opening, which might leave it closed but not removed from the list of connections. Clue?

redtopia commented 6 years ago

Also, in order to continue working, I restart Lucee (on Windows by restarting the Lucee Service) in order to hopefully clear out the closed websocket connections. However, doing so doesn't always clear this error. Sometimes I need to restart Lucee multiple times, and it's unclear why. But, this may be a clue... its the tail of my catalina log file after stopping the Lucee Service. Looks like there are a couple threads that could be created by the websocket extension that don't stop:

27-Aug-2017 15:16:11.005 INFO [Thread-111] org.apache.catalina.core.StandardService.stopInternal Stopping service Catalina 27-Aug-2017 15:16:11.009 INFO [local.chat.com-startStop-2] org.apache.catalina.core.ApplicationContext.log org.tuckey.web.filters.urlrewrite.UrlRewriteFilter INFO: destroy called 27-Aug-2017 15:16:11.068 WARNING [local.chat.com-startStop-2] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [FelixDispatchQueue] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread: java.lang.Object.wait(Native Method) java.lang.Object.wait(Object.java:502) org.apache.felix.framework.util.EventDispatcher.run(EventDispatcher.java:1118) org.apache.felix.framework.util.EventDispatcher.access$000(EventDispatcher.java:55) org.apache.felix.framework.util.EventDispatcher$1.run(EventDispatcher.java:102) java.lang.Thread.run(Thread.java:745) 27-Aug-2017 15:16:11.069 WARNING [local.chat.com-startStop-2] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [FelixFrameworkWiring] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread: java.lang.Object.wait(Native Method) java.lang.Object.wait(Object.java:502) org.apache.felix.framework.FrameworkWiringImpl.run(FrameworkWiringImpl.java:172) java.lang.Thread.run(Thread.java:745) 27-Aug-2017 15:16:11.070 WARNING [local.chat.com-startStop-2] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [ROOT] appears to have started a thread named [FelixStartLevel] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread: java.lang.Object.wait(Native Method) java.lang.Object.wait(Object.java:502) org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:283) java.lang.Thread.run(Thread.java:745) 27-Aug-2017 15:16:11.143 INFO [Thread-111] org.apache.coyote.AbstractProtocol.stop Stopping ProtocolHandler ["http-nio-8888"] 27-Aug-2017 15:16:11.143 INFO [Thread-111] org.apache.coyote.AbstractProtocol.stop Stopping ProtocolHandler ["ajp-nio-8009"]

isapir commented 6 years ago

This looks like something that Lucee starts, regardless of whether the WebSocket extension is used or not. See if you get a similar error on an Application that doesn't use web sockets.

isapir commented 6 years ago

Stack trace from zip file:

lucee.runtime.exp.NativeException: The WebSocket session [1] has been closed and no method (apart from close()) may be called on a closed session at org.apache.tomcat.websocket.WsSession.checkState(WsSession.java:782) at org.apache.tomcat.websocket.WsSession.getUserProperties(WsSession.java:727) at net.twentyonesolutions.lucee.websocket.WebSocket.getChannels(WebSocket.java:456) at net.twentyonesolutions.lucee.websocket.connections.ConnectionManager.unsubscribeAll(ConnectionManager.java:193) at net.twentyonesolutions.lucee.websocket.connections.ConnectionManager.broadcast(ConnectionManager.java:178) at net.twentyonesolutions.lucee.websocket.connections.ConnectionManager.call(ConnectionManager.java:393) at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:758) at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1693) at chatlistener_cfc$cf.udfCall(/ChatListener.cfc:114) at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:107) at lucee.runtime.type.UDFImpl._call(UDFImpl.java:355) at lucee.runtime.type.UDFImpl.call(UDFImpl.java:226) at lucee.runtime.ComponentImpl._call(ComponentImpl.java:662) at lucee.runtime.ComponentImpl._call(ComponentImpl.java:580) at lucee.runtime.ComponentImpl.call(ComponentImpl.java:1918) at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:758) at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1693) at chatlistener_cfc$cf.udfCall(/ChatListener.cfc:29) at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:107) at lucee.runtime.type.UDFImpl._call(UDFImpl.java:355) at lucee.runtime.type.UDFImpl.callWithNamedValues(UDFImpl.java:212) at lucee.runtime.ComponentImpl._call(ComponentImpl.java:663) at lucee.runtime.ComponentImpl._call(ComponentImpl.java:580) at lucee.runtime.ComponentImpl.callWithNamedValues(ComponentImpl.java:1935) at net.twentyonesolutions.lucee.app.LuceeAppListener.invokeWithNamedArgs(LuceeAppListener.java:118) at net.twentyonesolutions.lucee.websocket.WebsocketUtil.invokeListenerMethodWithNamedArgs(WebsocketUtil.java:77) at net.twentyonesolutions.lucee.websocket.WebsocketUtil.invokeListenerMethodWithNamedArgs(WebsocketUtil.java:103) at net.twentyonesolutions.lucee.websocket.LuceeEndpoint.onOpen(LuceeEndpoint.java:57) at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.init(WsHttpUpgradeHandler.java:127) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:705) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1520) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1476) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalStateException: The WebSocket session [1] has been closed and no method (apart from close()) may be called on a closed session ... 36 more

isapir commented 6 years ago

This is fixed in version 2.0.2

chewymole commented 5 years ago

Hi I am on tomcat / linux / lucee 5 with websocks 2.0.3 I have 2 issues.

how do you stop users from subscribing to the same channel multiple times?
when this happens, the older sessions will cause the error above because they have timed out, no messages can be sent to the live sessions in that same channel. 

i register the socket in the application startup, then on each page request a user who is logged into my app subscribes to a user specific channel (/ws/support/{userid}). The problem i am running into is that same user will get subscribed an infinite number of times. I am not sure how to stop duplicate subscriptions.

Thanks in advance

isapir commented 5 years ago

The problem i am running into is that same user will get subscribed an infinite number of times. I am not sure how to stop duplicate subscriptions.

@chewymole - you can maintain a Struct in the Application scope that will keep track of the userid values and close old connections upon new ones, but what if a user opens another tab, for example? Be sure that that's what you want to do first.

psarin commented 2 years ago

Seem to be having the same issue as @redtopia wrt trying to broadcast to closed connections. On Tomcat 9.0.56, Lucee Lucee 5.3.8.206 , and websocket extension 2.0.3. Any thoughts @isapir ?

psarin commented 2 years ago

In debugging a bit more, I tried:

  1. Upgrading Tomcat to 9.0.56. This did not help.
  2. Determined that error is occurring when client browser navigates away from page and it is trying to close the websocket. If server tries to send a message during this time, the error message occurs.

Here is a stack trace:

lucee.runtime.exp.NativeException: The WebSocket session [3] has been closed and no method (apart from close()) may be called on a closed session
  at org.apache.tomcat.websocket.WsSession.checkState(WsSession.java:1074)
  at org.apache.tomcat.websocket.WsSession.getUserProperties(WsSession.java:982)
  at net.twentyonesolutions.lucee.websocket.WebSocket.getStruct(WebSocket.java:498)
  at net.twentyonesolutions.lucee.websocket.WebSocket.toDumpData(WebSocket.java:561)
  at lucee.runtime.dump.DumpUtil.toDumpData(DumpUtil.java:289)
  at lucee.runtime.dump.DumpUtil.toDumpData(DumpUtil.java:336)
  at lucee.runtime.type.util.StructUtil.toDumpTable(StructUtil.java:169)
  at lucee.runtime.type.StructImpl.toDumpData(StructImpl.java:107)
  at lucee.runtime.dump.DumpUtil.toDumpData(DumpUtil.java:289)
  at lucee.runtime.functions.other.DumpStruct.call(DumpStruct.java:93)
  at lucee.runtime.functions.other.DumpStruct.call(DumpStruct.java:82)
  at dump_cfc$cf.udfCall1(/Dump.cfc:110)
  at dump_cfc$cf.udfCall(/Dump.cfc)
  at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
  at lucee.runtime.type.UDFImpl._call(UDFImpl.java:344)
  at lucee.runtime.type.UDFImpl.callWithNamedValues(UDFImpl.java:207)
  at lucee.runtime.ComponentImpl._call(ComponentImpl.java:685)
  at lucee.runtime.ComponentImpl._call(ComponentImpl.java:572)
  at lucee.runtime.ComponentImpl.callWithNamedValues(ComponentImpl.java:1930)
  at lucee.runtime.tag.CFTag.cfcStartTag(CFTag.java:384)
  at lucee.runtime.tag.CFTag.doStartTag(CFTag.java:178)
  at writedump_cfm$cf.udfCall(/writeDump.cfm:25)
  at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:106)
  at lucee.runtime.type.UDFImpl._call(UDFImpl.java:344)
  at lucee.runtime.type.UDFImpl.callWithNamedValues(UDFImpl.java:207)
  at lucee.runtime.functions.system.CFFunction.call(CFFunction.java:109)
  at server.tests.debugws_cfm$cf.call(/orrms/server/tests/debugWS.cfm:18)
  at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:1034)
  at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:926)
  at lucee.runtime.listener.ModernAppListener._onRequest(ModernAppListener.java:217)
  at lucee.runtime.listener.MixedAppListener.onRequest(MixedAppListener.java:44)
  at lucee.runtime.PageContextImpl.execute(PageContextImpl.java:2460)
  at lucee.runtime.PageContextImpl._execute(PageContextImpl.java:2450)
  at lucee.runtime.PageContextImpl.executeCFML(PageContextImpl.java:2421)
  at lucee.runtime.engine.Request.exe(Request.java:45)
  at lucee.runtime.engine.CFMLEngineImpl._service(CFMLEngineImpl.java:1179)
  at lucee.runtime.engine.CFMLEngineImpl.serviceCFML(CFMLEngineImpl.java:1125)
  at lucee.loader.engine.CFMLEngineWrapper.serviceCFML(CFMLEngineWrapper.java:97)
  at lucee.loader.servlet.CFMLServlet.service(CFMLServlet.java:51)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
  at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
  at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:540)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
  at mod_cfml.core.invoke(core.java:180)
  at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:687)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:357)
  at org.apache.coyote.ajp.AjpProcessor.service(AjpProcessor.java:433)
  at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
  at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:895)
  at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1732)
  at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
  at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
  at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
  at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
  at java.base/java.lang.Thread.run(Unknown Source)
 Caused by: java.lang.IllegalStateException: The WebSocket session [3] has been closed and no method (apart from close()) may be called on a closed session
  ... 63 more