hazelcast / hazelcast-tomcat-sessionmanager

Tomcat Based Web Session Replication
Other
33 stars 37 forks source link

HazelcastSessionManager.findSession can return null for an existing session under load #42

Closed gokhanoner closed 5 years ago

gokhanoner commented 6 years ago

I believe the issue is in this method: https://github.com/hazelcast/hazelcast-tomcat-sessionmanager/blob/master/tomcat8/src/main/java/com/hazelcast/session/HazelcastSessionManager.java#L226

from findSession method

HazelcastSession hazelcastSession = sessionMap.get(id);
if (hazelcastSession == null) {
  log.info("No Session found for:" + id);
  return null;
}
.....
// call remove method to trigger eviction Listener on each node to invalidate local sessions
sessionMap.remove(id);
sessionMap.set(id, hazelcastSession);

After checking sessionMap, if the session is present, first remove & then set called. After remove & before set call, if you call the same method, you can get null from sessionMap.get(id) call. This could cause users getting null for an existing session under load, as described in https://github.com/hazelcast/hazelcast-tomcat-sessionmanager/pull/41

edwardsmatt commented 6 years ago

Hi @gokhanoner,

I've put together a basic test case able to reproduce the INFO: No Session found for: message (as described above).

See: https://github.com/edwardsmatt/hazelcast-tomcat-sessionmanager-42-example

After swapping the remove and set for a sessionMap.put, I can't reproduce the issue of the session not being found.

I have a fork with this fix applied, but I'm hesitant to open a PR until I understand the following in greater detail:

My concerns are that I don't completely understand the ramifications of blindly changing those two lines (and also in the updateJvmRouteForSession), particularly with the merging of remote values.

Do you think it might be simply a matter of updating the EntryListener along these lines:

      public void entryUpdated(EntryEvent<String, HazelcastSession> event) {
                    if (event.getMember() == null || !event.getMember().localMember()) {
                        sessions.put(event.getKey(), event.getMergingValue());
                    }
                }

I am working on testing this further to understand how it actually affects it, particularly with the merging of remote values and the safety and precedence of update operations (i.e. simply overwriting existing values which may result in race conditions).

Any advice based on your understanding would be appreciated 👍

gokhanoner commented 6 years ago

@edwardsmatt As you pointed out, the main with issue replacing remove and set with sessionMap.put is that there's an EntryListener & it counts on remove event. I'll also check what can be done. Your solution for EntryListener seems promising. I'll update the ticket soon.

edwardsmatt commented 6 years ago

Hi @gokhanoner, I've implemented a test version using the map entryAdded and entryUpdated EntryListener events as discussed, however what I have seen is that when a session contains an object that is not on the tomcat classpath (i.e. a custom object on the application classpath), we end up with a ClassNotFoundException when trying to deserialize the object contained within the EntryEvent. I can put together an example if that's what is required.

However, I can also verify that PR #41 fixes this issue as well.

Have you had any chance to investigate a fix, if so are you able to provide an update to see if we can progress a fix?

Thanks very much 👍

alparslanavci commented 5 years ago

@edwardsmatt thank you for providing a detailed reproduce case. It helped a lot. I sent a PR for the fix of this issue, and tested it with your case. Please see #61.

ghkarmseveer commented 5 years ago

Hi,

I've just tried using this fix (by compiling master locally) but it does not seem to fix it - I still get sessions not being found under high load.

I think the fix does not lock enough, the sessionMap.get needs to be in a lock too:

            HazelcastSession hazelcastSession;
            sessionMap.lock(id);
            try {
                hazelcastSession = sessionMap.get(id);
            } finally {
                sessionMap.unlock(id);
            }

with that in place I can simulate a high load and get no lost sessions

alparslanavci commented 5 years ago

@ghkarmseveer can you please provide a reproducer for your case? It would be really helpful for us to fix if any issues remain.

Also, are the sticky sessions enabled in your case?

ghkarmseveer commented 5 years ago

Hi,

I've shamelessly forked @edwardsmatt example and updated it to tomcat 8.5 . locally I get failures with the 1.1.4 release

https://github.com/ghkarmseveer/hazelcast-tomcat-sessionmanager-42-example

My patched branch doesn't get failures (but its not based on 1.1.4)

No I'm not using sticky sessions

alparslanavci commented 5 years ago

@ghkarmseveer, in your example, you are trying to download hazelcast-tomcat85-sessionmanager-1.1.4.jar from secure Github releases page using curl command. It will not allow to download from a secure page directly without adding certificates. Can you please try to download from maven repo instead? See the example below:

curl http://central.maven.org/maven2/com/hazelcast/hazelcast-tomcat85-sessionmanager/1.1.4/hazelcast-tomcat85-sessionmanager-1.1.4.jar > cluster/lib/hazelcast-tomcat85-sessionmanager-1.1.4.jar

I run your test with using 1.1.4 and don't face any issues. Please let us know if you are still getting the failures.

ghkarmseveer commented 5 years ago

@alparslanavci sorry about the download link but still fails for me locally:

image 88 failures on that run