gautamaino / gwteventservice

Automatically exported from code.google.com/p/gwteventservice

Deadlock in DefaultEventRegistry (GWTEventService 1.0.1) #8

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Scenario:

We have a thread "StatusMessageListenerThread" in our application that 
acts as an event producer for the EventExecutorService.
The events are not user-specific and all client browsers will listen for 
these events.

When the StatusMessageListenerThread posts a new event it will execute the 
following line of code:

EventExecutorServiceFactory.getInstance().getEventExecutorService(null).addEvent(DOMAIN, new StatusMessageEvent(currentStatus));

Two days ago, I realized that our whole messaging subsystem had ground to 
a halt. On further investigation I found a deadlock
in DefaultEventRegistry that can occur when a client browser times out at 
the same moment that a new event is added to the
EventExecutorService.

You can see this in the following thread dumps:

//        Name: StatusMessageListenerThread
//        State: BLOCKED on de.novanic.eventservice.service.registry.DefaultEventRegistry$UserInfo@432ed2 owned by: Timer-802
//        Total blocked: 17  Total waited: 0
//
//        Stack trace:
//        de.novanic.eventservice.service.registry.DefaultEventRegistry$UserInfo.doNotifyAll(Unknown Source)
//        de.novanic.eventservice.service.registry.DefaultEventRegistry$UserInfo.addEvent(Unknown Source)
//        de.novanic.eventservice.service.registry.DefaultEventRegistry.addEvent(Unknown Source)
//        de.novanic.eventservice.service.registry.DefaultEventRegistry.addEvent(Unknown Source)
//           - locked de.novanic.eventservice.service.registry.DefaultEventRegistry@a54312
//        de.novanic.eventservice.service.DefaultEventExecutorService.addEvent(Unknown Source)
//        com.istec.pls.base.StatusMessageListener$StatusMessageListenerThread.publishCurrentStatus(StatusMessageListener.java:176)
//        com.istec.pls.base.StatusMessageListener$StatusMessageListenerThread.run(StatusMessageListener.java:140)

This is our event producer thread. As you can see, it has acquired the lock on the 
DefaultEventRegistry singleton,
because "DefaultEventRegistry#addEvent(Domain aDomain, Event anEvent)" 
is synchronized, and it now waits for the lock on
the UserInfo instance "432ed2" (which is the client session that timed 
out) in "UserInfo#addEvent(Domain aDomain, Event anEvent)".

//    Name: Timer-802
//    State: BLOCKED on de.novanic.eventservice.service.registry.DefaultEventRegistry@a54312 owned by: StatusMessageListenerThread
//    Total blocked: 1  Total waited: 1
//
//    Stack trace:
//    de.novanic.eventservice.service.registry.DefaultEventRegistry.addEventUserSpecific(Unknown Source)
//    de.novanic.eventservice.service.registry.DefaultEventRegistry.unlisten(Unknown Source)
//       - locked de.novanic.eventservice.service.registry.DefaultEventRegistry$UserInfo@432ed2
//    de.novanic.eventservice.service.registry.DefaultEventRegistry$UserInfo$ScheduleTimer.run(Unknown Source)
//    java.util.TimerThread.mainLoop(Timer.java:512)
//    java.util.TimerThread.run(Timer.java:462)

"Timer-802" is the timer thread of the UserInfo instance "432ed2" which 
timed out. It has acquired a lock on the UserInfo instance "432ed2"
in DefaultEventRegistry#unlisten(String aUserId) and now attempts to 
acquire the lock on the DefaultEventRegistry singleton in the
call to DefaultEventRegistry#addEventUserSpecific(UserInfo aUserInfo, 
Event anEvent) which is synchronized.
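
The two dumps show a classic lock-order inversion: one thread takes the registry lock and then the UserInfo lock, the other takes them in the opposite order. One common remedy is to make every code path acquire the two locks in the same order. The following is only a minimal sketch of that idea under stated assumptions, not the GWTEventService source and not necessarily what the patch does: Registry, UserInfo, runDemo and worker are hypothetical stand-ins invented for illustration.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical model of the two code paths; not the real GWTEventService classes.
final class LockOrderSketch {

    /** Stand-in for a per-client session object. */
    static final class UserInfo {
        int queued;
    }

    /** Stand-in for the registry singleton. */
    static final class Registry {
        private int operations;

        // Producer path: registry lock first, then the user lock.
        void addEvent(UserInfo user) {
            synchronized (this) {
                synchronized (user) {
                    operations++;
                    user.queued++;
                }
            }
        }

        // Timeout path. The deadlocking variant would lock the user first and
        // the registry second; here the order matches addEvent.
        void unlisten(UserInfo user) {
            synchronized (this) {
                synchronized (user) {
                    operations++;
                    user.queued = 0;
                }
            }
        }

        synchronized int operations() {
            return operations;
        }
    }

    /** Runs both paths concurrently and returns the number of completed operations. */
    static int runDemo(int rounds) {
        Registry registry = new Registry();
        UserInfo user = new UserInfo();
        CountDownLatch start = new CountDownLatch(1);
        Thread producer = worker(start, rounds, () -> registry.addEvent(user));
        Thread timer = worker(start, rounds, () -> registry.unlisten(user));
        start.countDown(); // release both threads at the same moment
        try {
            producer.join(10_000);
            timer.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (producer.isAlive() || timer.isAlive()) {
            throw new IllegalStateException("threads are still blocked");
        }
        return registry.operations();
    }

    private static Thread worker(CountDownLatch start, int rounds, Runnable action) {
        Thread thread = new Thread(() -> {
            try {
                start.await();
            } catch (InterruptedException e) {
                return;
            }
            for (int i = 0; i < rounds; i++) {
                action.run();
            }
        });
        thread.setDaemon(true); // do not keep the JVM alive if something goes wrong
        thread.start();
        return thread;
    }

    public static void main(String[] args) {
        System.out.println("operations completed: " + runDemo(10_000));
    }
}
```

Because both paths take the registry lock before the UserInfo lock, neither thread can hold one lock while waiting for a thread that holds the other. Swapping the two synchronized blocks in unlisten would recreate the inversion seen in the dumps.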

Bummer! The world has stopped. Needless to say, eventually all 
timer threads will block in the same way that Timer-802 does.
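
A situation like this can also be detected from inside the JVM instead of waiting for someone to take a thread dump. The sketch below uses the standard java.lang.management API to report threads that are deadlocked on object monitors; the class name DeadlockProbe and the method describeDeadlocks are hypothetical and not part of GWTEventService.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; not part of GWTEventService.
final class DeadlockProbe {

    /** Returns one line per monitor-deadlocked thread, or an empty string if there is none. */
    static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Returns null when no threads are deadlocked on object monitors.
        long[] ids = threads.findMonitorDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder report = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread terminated in the meantime
            }
            report.append(info.getThreadName())
                  .append(" blocked on ").append(info.getLockName())
                  .append(" owned by ").append(info.getLockOwnerName())
                  .append('\n');
        }
        return report.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "no deadlock detected" : report);
    }
}
```

Calling describeDeadlocks periodically from a monitoring thread and logging any non-empty result would surface the problem soon after the first two threads lock each other out.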

Fortunately, this hasn't happened (yet) in any of our production 
applications, but it is a time bomb we are sitting on.
I want to fix this ASAP. I've analyzed the code a little and may be able to 
come up with a patch today that you could have a look at.

Cheers,
Stefan

Original issue reported on code.google.com by stefan.z...@gmail.com on 6 Sep 2009 at 1:50

GoogleCodeExporter commented 8 years ago
Here is the suggested patch. What do you think about it?

Original comment by stefan.z...@gmail.com on 6 Sep 2009 at 2:07

GoogleCodeExporter commented 8 years ago
The patch is ok and will be integrated in version 1.0.2. Thanks for your 
help/work!

Full conversation:
http://gwteventservice.freeforums.org/deadlock-in-defaulteventregistry-gwteventservice-1-0-1-t29.html

Original comment by sven.strohschein@googlemail.com on 10 Sep 2009 at 10:37

GoogleCodeExporter commented 8 years ago
Fixed in versions 1.0.2 and 1.1.

Original comment by sven.strohschein@googlemail.com on 10 Sep 2009 at 9:30