Closed spring-projects-issues closed 8 years ago
Rossen Stoyanchev commented
Threads 33, 31, 29, and 27 are waiting for the same SseEmitter instance <0x00000006e6a04620>, but they're not holding any other locks that would prevent it from being released. Thread 25 is the thread currently holding that SseEmitter instance, and it is in turn waiting on a CountDownLatch deep in Tomcat's writing code.
So from a server perspective I don't see any deadlock. The synchronization is doing just what it is supposed to with 4 threads waiting for their turn while a 5th one is writing. The question is what makes thread 25 stop without completing the write. What is it waiting on? Perhaps a better question for the Tomcat mailing list to shed some light based on where specifically it is blocked as shown by the stack trace.
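To illustrate the distinction, here is a minimal plain-Java sketch of the same shape as the thread dump: one thread holds a monitor while parked on a CountDownLatch, and four others are BLOCKED on that monitor. The class and variable names are hypothetical, and a plain Object stands in for the SseEmitter; none of this is taken from Spring or Tomcat. The JVM's own deadlock detector reports nothing in this situation because there is no lock cycle.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ContentionNotDeadlock {

    /** Outcome of one run: how many writers were BLOCKED, and whether the JVM saw a deadlock. */
    public static final class Result {
        public final int blockedWriters;
        public final boolean deadlockDetected;

        Result(int blockedWriters, boolean deadlockDetected) {
            this.blockedWriters = blockedWriters;
            this.deadlockDetected = deadlockDetected;
        }
    }

    public static Result run() throws InterruptedException {
        final Object emitter = new Object();                      // stand-in for the shared emitter's monitor
        final CountDownLatch writeDone = new CountDownLatch(1);   // stand-in for the latch inside the container's write
        final CountDownLatch holderHasLock = new CountDownLatch(1);

        // Plays the role of "thread 25": owns the monitor and parks on a latch.
        Thread holder = new Thread(() -> {
            synchronized (emitter) {
                holderHasLock.countDown();
                try {
                    writeDone.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        holder.start();
        holderHasLock.await();

        // Play the role of "threads 33, 31, 29, 27": each just wants its turn.
        List<Thread> waiters = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Thread t = new Thread(() -> {
                synchronized (emitter) {
                    // a real writer would send its event here
                }
            }, "waiter-" + i);
            t.start();
            waiters.add(t);
        }

        // Wait until every waiter is actually BLOCKED on the monitor.
        for (Thread t : waiters) {
            while (t.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
        }
        int blocked = 0;
        for (Thread t : waiters) {
            if (t.getState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }

        // Returns null when no cycle of lock ownership exists.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] deadlocked = mx.findDeadlockedThreads();

        // Release the holder; all waiters then proceed one by one.
        writeDone.countDown();
        holder.join();
        for (Thread t : waiters) {
            t.join();
        }
        return new Result(blocked, deadlocked != null);
    }

    public static void main(String[] args) throws InterruptedException {
        Result r = run();
        System.out.println("blocked writers: " + r.blockedWriters
                + ", deadlock detected: " + r.deadlockDetected);
    }
}
```

Once the latch is released, every waiter completes. In the reported scenario, the open question is the same one: what would release the latch that thread 25 is parked on.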
Is it only reproducible with multiple tabs in the same browser? What about using a different browser? Perhaps Chrome is locked waiting on one of these connections? I can't imagine why but I can't explain it otherwise from the server-side stack trace.
Martin Mace commented
I just added a few more details to the description about the execution scenario. It would be nice to use another browser, but I don't have another one that supports SSE. I will try another Chrome on another machine. Indeed, I suspected the web server (Tomcat) because of some locks that are shown in the stack. The blocked threads might be waiting on a condition (i.e., an ACK) that the browser doesn't send, but it is the Spring SSE code that shows the blocked threads.
Martin Mace commented
I have tested from two and three browsers on different machines; on each machine I had two tabs (one was incognito, but that shouldn't matter). In this test I noticed that the number of invocations before the issue occurs grows with the number of machines. For example, with two machines the reported issue occurred after the 4th invocation; with three machines it took more (5 or 6). In any case, it does not seem to be caused by the Spring code; it seems to be environmental. I would have to consider other client app(s) and perhaps alter the scenario. I would suggest closing the issue. Thank you for your time. Regards, Martin
Rossen Stoyanchev commented
Yes, this is what I see in the stack trace also. In other words, there is nothing to indicate an issue or anything wrong on the server side. A deadlock would involve two threads each holding a lock the other is waiting for. That's not what's happening though. Even if we simply removed the synchronization in SseEmitter, you would have to synchronize yourself because you're not supposed to write to the same response (ultimately an OutputStream) simultaneously. And then you'd be back in exactly the same situation.
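As a rough sketch of what "synchronize yourself" would look like, assuming a hypothetical SerializedEventWriter wrapper around a raw OutputStream (this is not Spring's implementation, and the names are made up for illustration): each event is written in several calls, so without the synchronized keyword, frames from different threads could interleave.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SerializedEventWriter {

    private final OutputStream out;

    public SerializedEventWriter(OutputStream out) {
        this.out = out;
    }

    /**
     * Writes one SSE-style frame ("data: ...", then a blank line).
     * The frame is written in three calls; synchronized keeps them together
     * so that only one thread at a time writes to the stream.
     */
    public synchronized void send(String data) throws IOException {
        out.write("data: ".getBytes(StandardCharsets.UTF_8));
        out.write(data.getBytes(StandardCharsets.UTF_8));
        out.write("\n\n".getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    /** Starts several writer threads against one stream and returns everything that was written. */
    public static String demo(int threads, int eventsPerThread) throws InterruptedException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        SerializedEventWriter writer = new SerializedEventWriter(sink);

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < eventsPerThread; i++) {
                    try {
                        writer.send("t" + id + "-" + i);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws InterruptedException {
        String written = demo(4, 250);
        System.out.println("frames written: " + written.split("\n\n").length);
    }
}
```

The trade-off is the one described above: if a single write stalls while holding the lock, every other sender queues up behind it, which is exactly the picture in the thread dump.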
I would try to debug or log more on the client side to see what's happening there in each tab. It almost looks like there is some sort of multiplexing and/or locking going on. You could also try running with just one tab from different machines/browsers and that would narrow it down to using multiple tabs.
Rossen Stoyanchev commented
Resolving for now as it doesn't seem related to anything we are doing.
Martin Mace opened SPR-14708 and commented
A test client has been created (as a PoC) to test SSE functionality. Chrome was used as the client to access a REST endpoint. Once a connection is established, an SseEmitter instance is created and stored in a list of subscribers. Subsequently, SSE events are pushed to the subscribers. The events are generated in a separate thread (1k events) and sent to all subscribers.
Multiple connections are attempted from the same browser. After three connections were established, the notifications stop and no events are pushed anymore. Looking at the stack, there seems to be a deadlock in sending the events. The culprit classes seem to be SseEmitter.java and/or ResponseBodyEmitterReturnValueHandler.java. Please see the attached stack.
The scenario is as follows:
The SseEmitter was created with effectively no timeout: Long.MAX_VALUE.
Affects: 4.3.2