isapir / lucee-websocket

Enables server WebSockets for Lucee via JSR-356 compliant servlet containers (e.g. Tomcat 8, Jetty 9.1, etc.)
GNU Lesser General Public License v2.1
17 stars 6 forks source link

Null sessionScope argument being sent to onOpen Listener Component callback #10

Open redtopia opened 6 years ago

redtopia commented 6 years ago

I've noticed this happening intermittently, but now it's consistent. When I dump the Arguments scope of my onOpen callback, I'm getting a null sessionScope.

image

Why would this be happening, and how can I debug?

isapir commented 6 years ago

Are you sure that a Session is created for that cookie.cfid with that Application that is in applicationScope.applicationName?

redtopia commented 6 years ago

The cfid cookie being sent in the websocket request header is different from the user's Session.cfid.

isapir commented 6 years ago

Not sure what you mean? They should be the same.

Do you mean that you are sending different ones on purpose or that you are trying to send the same one but somehow upon inspection you found that they are different?

In any event, that would explain your issue. The extension has no way of knowing the Session.cfid on its own. It takes the Cookie.cfid from the incoming headers when the handshake is created, and tries to retrieve from Lucee the combination of Application.applicationName and Cookie.cfid. If you are passing a cfid and a Session for it does not exist with that applicationName then you get null.

Let's keep questions/answers in the Lucee dev forum and only open tickets after confirming a bug/issue.

redtopia commented 6 years ago

I'm not manually setting headers, and I'm just noticing the difference in the cfid values. Could be related to logging in and logging out, and attempting to reset my session. I created a post: https://dev.lucee.org/t/websockets-extension-get-correct-session-scope/2694

redtopia commented 6 years ago

Igal, it's been a while since I got into my websocket testing. I just pushed websockets using your extension to production using the latest lucee release and the latest websocket release. This is still an issue, though it's difficult to reproduce. Seems like the extension is not finding the current session, though normal, non-websocket requests are working fine... the session is found.

First thing, could your code check for a null session before passing it into the onOpen method? Or, should I be doing that, in which case my code would assume that there is no session and would refuse the connection?

Second thing, since the tomcat websocket should be finding the session, what can I do to help figure out what's going on? Here's some of the logging information that is relevant:

From websocket.log:

"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:57","websocket","HandshakeHandler.modifyHandshake(); C:\Dev\MusicLessons\apps\musiclessons-web02\www\@/ws/user/{channel}; cfid: 8dc1da9a-a56c-4436-99e8-623e72e7180f"
"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:57","websocket","calling listener.onHandshake()"
"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:57","websocket","connection ea enter onOpen()"
"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:57","websocket","connection ea calling listener.onOpen()"
"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:57","websocket","calling listener.onOpen()"
"DEBUG","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","HandshakeHandler.modifyHandshake(); C:\Dev\MusicLessons\apps\musiclessons-web02\www\@/ws/messages/thread/{channel}; cfid: 8dc1da9a-a56c-4436-99e8-623e72e7180f"
"DEBUG","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","calling listener.onHandshake()"
"DEBUG","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","connection eb enter onOpen()"
"DEBUG","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","connection eb calling listener.onOpen()"
"DEBUG","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","calling listener.onOpen()"
"INFO","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","connection eb listener.onOpen() refused the connection by returning false. closing connection."
"DEBUG","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","connection eb enter onClose(); CloseReason: code [1000], reason []"
"DEBUG","http-nio-8888-exec-9","05/17/2018","23:15:57","websocket","calling listener.onClose()"
"INFO","http-nio-8888-exec-8","05/17/2018","23:15:58","websocket","connection ea listener.onOpen() refused the connection by returning false. closing connection."
"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:58","websocket","connection ea enter onClose(); CloseReason: code [1000], reason []"
"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:58","websocket","calling listener.onClose()"
"DEBUG","http-nio-8888-exec-8","05/17/2018","23:15:58","websocket","listener.onChannelOpen() is not implemented"
"INFO","http-nio-8888-exec-8","05/17/2018","23:15:58","websocket","channel usr_developer opened"
"DEBUG","Timer-8","05/17/2018","23:16:21","websocket","listener.onUnsubscribe() is not implemented"
"INFO","Timer-8","05/17/2018","23:16:21","websocket","ea unsubscribed from usr_developer"
"DEBUG","Timer-8","05/17/2018","23:16:21","websocket","listener.onChannelClose() is not implemented"
"INFO","Timer-8","05/17/2018","23:16:21","websocket","channel usr_developer closed"
"DEBUG","Timer-5","05/17/2018","23:16:21","websocket","listener.onUnsubscribe() is not implemented"
"INFO","Timer-5","05/17/2018","23:16:21","websocket","eb unsubscribed from 1321"

From my own logging - stack trace for the exception when trying to use the null session scope passed into the onOpen() method:

lucee.runtime.exp.ExpressionException: The parameter sessionScope to function getUserID is required but was not passed in. 
at lucee.runtime.type.UDFImpl.defineArguments(UDFImpl.java:133) 
at lucee.runtime.type.UDFImpl._call(UDFImpl.java:354) 
at lucee.runtime.type.UDFImpl.call(UDFImpl.java:226) 
at lucee.runtime.type.scope.UndefinedImpl.call(UndefinedImpl.java:771) 
at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:756)
at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1716) 
at websockets.msgs_thread_cfc$cf.udfCall(/cfusion/websockets/msgs_thread.cfc) 
at lucee.runtime.type.UDFImpl.implementation(UDFImpl.java:107) 
at lucee.runtime.type.UDFImpl._call(UDFImpl.java:357) 
at lucee.runtime.type.UDFImpl.callWithNamedValues(UDFImpl.java:212) 
at lucee.runtime.ComponentImpl._call(ComponentImpl.java:659) 
at lucee.runtime.ComponentImpl._call(ComponentImpl.java:573) 
at lucee.runtime.ComponentImpl.callWithNamedValues(ComponentImpl.java:2014) 
at net.twentyonesolutions.lucee.app.LuceeAppListener.invokeWithNamedArgs(LuceeAppListener.java:118)
at net.twentyonesolutions.lucee.websocket.WebsocketUtil.invokeListenerMethodWithNamedArgs(WebsocketUtil.java:77)
at net.twentyonesolutions.lucee.websocket.WebsocketUtil.invokeListenerMethodWithNamedArgs(WebsocketUtil.java:103)
at net.twentyonesolutions.lucee.websocket.LuceeEndpoint.onOpen(LuceeEndpoint.java:57) 
at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.init(WsHttpUpgradeHandler.java:127)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:705)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1520) 
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1476) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) 
at java.lang.Thread.run(Thread.java:745) 

And the code where the exception occurs:

    public function onOpen(websocket, endpointConfig, sessionScope, applicationScope) {
        try {
            var idUser = getUserID(Arguments.sessionScope);
            ....
redtopia commented 6 years ago

One thing that comes to mind is that this seems to occur after sessions have been terminated, followed by a new one being created. This happens when the user walks away from the computer and comes back after the session timeout setting. A refresh of the page will create a new session.

silashatfield commented 6 years ago

I am having a similar issue but in my case, I am using WSS secured sockets. It works in both Firefox and Google Chrome, but for IE Edge the session is null. I've done as much debugging as possible, and I know that for IE Edge the cookie is different from the application. Before I setup the HttpSessionInitializerFilter, the same thing was happening for all browsers, so I was thinking it might be related to HttpSessionInitializerFilter and IE Edge.

redtopia commented 6 years ago

I'm also using wss, and I'm seeing this in Chrome. I haven't tested with Edge. In my limited testing, FF seems okay.

Also, when this condition occurs, the session will never be found, even if I completely quit the browser and open it again. The condition appears to occur in perpetuity - the websocket extension will never find the session, even after many days.

redtopia commented 6 years ago

A few more observations and clarifications:

redtopia commented 6 years ago

Any movement on this Igal? Timeframe?

isapir commented 6 years ago

I'm sorry. Been extremely busy lately and no time frame ATM.

silashatfield commented 6 years ago

isapir, any tips on getting this project loaded up for debugging? i would like to help out, but not really sure on where to start with testing this project. IE, would I need to download the tomcat project and step into it to be able to hit your code?

redtopia commented 6 years ago

@silashatfield I would be happy to help from my Lucee app. I've got Fusion Reactor running and can dig into it as much as needed, and I can reproduce the issue quite easily.

redtopia commented 5 years ago

I may have stumbled upon the root problem with NULL session arguments. I was running a test with 2 browsers. One browser connected to the websocket channel fine, and the other didn't. In the browser that worked, the onOpen() method of my listener was being sent a NULL session argument, which was causing the onOpen method to return false.

Looking deeper, in the browser that did work, there was only one cfid cookie, but in the browser that didn't work, there were 2 cfid cookies - one was a domain cookie .myapp.com, and the other was a subdomain local.myapp.com cookie.

I believe this was happening because I had this.setDomainCookies=true; in application.cfc (when sessions are enabled). In the browser that didn't work, I had a different tab open that was accessing a separate app that's mapped to the same domain, which seemed to have created a different cfid cookie for that session. The app is different, so I had a different session in the same domain.

After turning off this.setDomainCookies, and clearing my session cookies, the websocket extension was able to find the correct session, and I no longer received a NULL session argument in my hander. Now I don't get domain cookies, so each cfid is found in it's own subdomain.

Would it be possible to enhance the code that attempts to find the correct session to take into consideration domain cookies?

isapir commented 5 years ago

Interesting. I am not sure that the domain is sent with the Request's cookie. Can you check the Request headers when setDomainCookies is enabled vs. disabled?

redtopia commented 5 years ago

Looks like all you get are duplicate cfid cookies without the domain for each one.

When finding a session, perhaps you could you use the Host header to match against each cfid cookie that you encounter. Can you write a log entry when you fail to find a session, and perhaps give a reason why?

I've since set this.setDomainCookies=false; because I don't need to manage sessions across multiple subdomains. If I did, I think I would have to make sure that they all used the same this.name="myapp"; in application.cfc. That way, you would always have just one session. I think the reason why I'm getting multiple sessions is because both subdomains use a different this.name.

BTW, since I had already fixed the duplicate cfid cookie problem in the browser, I had trouble getting back in that state after setting this.setDomainCookies=true; Maybe the domain cookies only get set when the session is created. Either way, I manually created a duplicate cfid cookie to test your question.

silashatfield commented 5 years ago

I added a pull request a few months back to resolve the issue. Browsers are storing the cookie values in different case formats, IE Cfid CFID cfid. When trying to set the session, it does not correctly get the header which makes it null.

redtopia commented 5 years ago

@silashatfield I see your change... but I would expect that further changes would be needed to handle multiple cfid cookies. Thanks for your help on this... at least we know why the session scope is null. Some additional logging could help find problems with session mapping.

redtopia commented 4 years ago

I'm updating this thread because I recently implemented client-side code (in javascript) to close/open connections for various reasons... like if the app is inactive, or is no longer in focus, etc. In testing, I began noticing that sometimes I could not re-open a connection that I had just closed.

I began digging into the logs, and I noticed that the extension was passing in NULL session scopes randomly. Most of the time the socket was successfully re-opened, and sometimes I would get a NULL session.

I eliminated the issue where multiple cfid's may have been causing the extension to not find the correct session, and I verified in the logs that the correct CFID was being used when a NULL session was passed in.

Here's an excerpt from the logs when this happens:

"DEBUG","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","calling listener.onHandshake()"
"DEBUG","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","connection 1b enter onOpen()"
"DEBUG","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","connection 1b calling listener.onOpen()"
"DEBUG","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","calling listener.onOpen()"
"INFO","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","connection 1b listener.onOpen() refused the connection by returning false. closing connection."
"DEBUG","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","connection 1b enter onClose(); CloseReason: code [1000], reason []"
"DEBUG","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","calling listener.onClose()"
"DEBUG","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","listener.onChannelOpen() is not implemented"
"INFO","http-nio-8888-exec-10","07/16/2019","14:13:18","websocket","channel usr_developer opened"

Not sure why the logs say that the channel was opened because it most definitely wasn't.

@isapir, I'm wondering what it will take to dig a little deeper into this issue?