jetty / jetty.project

Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more
https://eclipse.dev/jetty

WebSocket hangs in blockingWrite #272

Closed jmcc0nn3ll closed 5 years ago

jmcc0nn3ll commented 8 years ago

Migrated from Bugzilla #484698. Status: NEW. Severity: normal. Component: websocket for 9.3.x. Reported in version: 9.3.5. Platform: PC. Assigned to: Joakim Erdfelt.

On 2015-12-18 10:48:14 -0500, Ben Tatham wrote:

My application hangs in the following state. This is usually triggered by some underlying network change (like an IP address change of the host running Jetty), but it might happen in other cases too.

java.lang.Thread.State: WAITING
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <1edddee> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
    at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:210)
    at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:82)
    at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
    at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendBytes(WebSocketRemoteEndpoint.java:252)

Based on the code, it looks like the m_complete.await() call should be signalled on close, failure, or success, but never is.

amitdev commented 8 years ago

I'm also hitting an application hang, with a thread dump showing a similar stack trace. Is this issue still present on 9.3.6?

amitdev commented 8 years ago

At least in my case, it looks like the client did not send a response back to the server when it did a sendString. There is also an associated timeout in the logs (Jetty 9.3.6):

02:23:41.301 [Scheduler-1157726741] ERROR c.o.o.c.endpoints.WebSocketEndpoint - Error in websocket session
java.net.SocketTimeoutException: Timeout on Read
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onReadTimeout(AbstractWebSocketConnection.java:576) [websocket-common-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.AbstractConnection.onFillInterestedFailed(AbstractConnection.java:162) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillInterestedFailed(AbstractWebSocketConnection.java:522) [websocket-common-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.failed(AbstractConnection.java:267) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:125) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.ssl.SslConnection.onFillInterestedFailed(SslConnection.java:216) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.failed(AbstractConnection.java:267) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:125) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:163) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50) [jetty-io-9.3.6.v20151106.jar:9.3.6.v20151106]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_65]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_65]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_65]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_65]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]

And all the threads hang in the following state. I have to restart the process to recover:

"qtp1849201180-18080" #18080 prio=5 os_prio=0 tid=0x00007f5f2aa91000 nid=0x28a5 waiting on condition [0x00007f5f0c783000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method)

  • parking to wait for <0x0000000700e3b8c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(Redefined) at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(Redefined) at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(Redefined) at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(Redefined)
phoenix commented 8 years ago

Is there any plan to fix this?

joakime commented 8 years ago

@phoenix as there's no reproducer for this, it's hard to identify the problem and implement a fix.

jonhill1977 commented 8 years ago

We see the same issue on 9.3.7.v20160115:

"pool-18-thread-42" #6179 prio=5 os_prio=0 tid=0x00007f7be0262800 nid=0x1741 waiting on condition [0x00007f7b1a21a000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000687b00e08> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:210)
    at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:82)
    at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
    at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:387)

As a workaround (to prevent threads from hanging permanently) we have added the following to org.eclipse.jetty.websocket.common.BlockingWriteCallback:

@Override
protected long getIdleTimeout()
{
  return 30000L;
}

This isn't a proper fix, but it at least appears to time out the blocking write, preventing threads from being locked permanently (which eventually leads to thread starvation in our thread pool). I appreciate this is difficult to replicate, but perhaps you could consider exposing the timeout so we can set it to a non-infinite value without having to recompile this jar?

joakime commented 8 years ago

@jonhill1977 do you have a reproduction / testcase?

jonhill1977 commented 8 years ago

@joakime Unfortunately not. I appreciate how hard it is for you to resolve this without reproduction steps. It happened in two of our production environments within a 12-hour period, but it hasn't happened in our test environment, so I assume it's either related to the increased load in production, or, more likely, that a client socket is disconnecting unexpectedly (or otherwise not responding as expected).

If you can't resolve the underlying issue, could you at least consider allowing access to the IdleTimeout (which is currently implemented as always returning -1 in the superclass) as suggested in my workaround given previously? At least that way, we could set the timeout (in Jetty config or from our code) and we wouldn't have to recompile the jar when we upgrade to the latest versions of Jetty.

gregw commented 8 years ago

@joakime I think the timeout mechanism in this class was implemented for when we had similar problems with HttpOutput. We expect to always get a success or failure callback, but if they never come, then we block forever.

We override in HttpOutput to look at getHttpConfiguration().getBlockingTimeout(). Currently we have solved all known problems in HttpOutput, so our default is -1.

So you could override it in websocket to look for a similar configuration, or perhaps just change it to return a very long timeout (1h), so at least it is not infinite.

But essentially that is just a band-aid. If a reproduction cannot be found, it may be better to use the timeout to trigger some diagnostic code that reports the history and current state of the websocket, which might suggest why it has neither succeeded nor failed.
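
A minimal sketch of this suggestion, assuming BlockingWriteCallback is patched in place (the rest of the class stays as-is); the system property name below is purely illustrative, not an existing Jetty configuration key:

// Added inside org.eclipse.jetty.websocket.common.BlockingWriteCallback.
// SharedBlockingCallback's default of -1 means "wait forever"; defaulting to
// one hour here means a lost callback cannot park a thread permanently.
private static final long BLOCKING_TIMEOUT =
    Long.getLong("org.eclipse.jetty.websocket.blockingWriteTimeout", 3_600_000L);

@Override
protected long getIdleTimeout()
{
    return BLOCKING_TIMEOUT;
}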

askaribragimov commented 7 years ago

+1 to allowing access. I may be seeing the issue in my Spring app: when a websocket client consumes very slowly (by waiting 4 s before "finalization" of consuming) you can end up in this state. Attaching my thread dumps from slow clients: http://pastebin.com/zE16BBgV

I presently have a reproduction "kit". I can outline how it can be built using Spring.

RMachnik commented 7 years ago

Hi guys, I am struggling with this issue. It clearly happens when the WS client is killed without sending a response. Jetty hangs in this method: WebSocketRemoteEndpoint.java:387. The consequence is starvation of the other clientOutbound threads that want to send heartbeats to this particular user. The only recovery is to restart Jetty.

Is there anyone working on this or have some workaround for that?

askaribragimov commented 7 years ago

@rafik991 This looks like the issue I had. You can try to get the Jetty source and modify the method that jonhill1977 described in his comment on Jul 26, 2016.

It eventually got rid of the hung threads.

aigor commented 7 years ago

Hi guys, I have the same issue: threads that send outgoing messages sometimes hang due to broken websockets. Unfortunately I cannot use a patched Jetty library, so I am using a separate thread pool for sending messages in order to decrease the impact of hung threads, as sketched below.

It would be great if the library itself had this fix!
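
A minimal sketch of that isolation workaround, assuming the Jetty WebSocket API (org.eclipse.jetty.websocket.api.Session) is used directly; the class name and pool size are arbitrary:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.eclipse.jetty.websocket.api.Session;

// Blocking sends are pushed onto a small dedicated pool, so a write that hangs
// in blockingWrite() only ties up a thread from this pool instead of one from
// the application's main executor. This limits the damage; it does not fix the hang.
public class IsolatedSender
{
    private final ExecutorService sendPool = Executors.newFixedThreadPool(4);

    public void send(Session session, String message)
    {
        sendPool.execute(() -> {
            try
            {
                session.getRemote().sendString(message);
            }
            catch (Exception e)
            {
                e.printStackTrace();
            }
        });
    }
}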

askaribragimov commented 7 years ago

Simply setting a non-infinite getIdleTimeout() in the source code would already help. I hope the library maintainers will abandon the infinite wait time or make it an option!

joakime commented 7 years ago

@askaribragimov which idle timeout are you referring to? (there are many in Jetty, each referring to a different layer in the stack)

ruslan-reshetnikov commented 7 years ago

@joakime We have the same issue.

We call SimpMessagingTemplate.convertAndSend from 3 different threads:

thread 1:

thread 2:

thread 3:

Outgoing message sending hangs after 20-30 hours.

We use Jetty as the embedded servlet container for a Spring Boot application. Spring Boot Jetty Starter ver. 1.4.1.RELEASE: https://mvnrepository.com/artifact/org.springframework.boot/spring-boot-starter-jetty/1.4.1.RELEASE

See thread dump below:

"clientOutboundChannel-1" #136 prio=5 os_prio=0 tid=0x0000000042a38000 nid=0x856c waiting on condition [0x00000000537fe000]
   java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  <0x00000003ca68ae30> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(Unknown Source)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
 at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:210)
 at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:82)
 at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
 at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:393)
 at org.springframework.web.socket.adapter.jetty.JettyWebSocketSession.sendTextMessage(JettyWebSocketSession.java:190)
 at org.springframework.web.socket.adapter.AbstractWebSocketSession.sendMessage(AbstractWebSocketSession.java:104)
 at org.springframework.web.socket.handler.ConcurrentWebSocketSessionDecorator.tryFlushMessageBuffer(ConcurrentWebSocketSessionDecorator.java:132)
 at org.springframework.web.socket.handler.ConcurrentWebSocketSessionDecorator.sendMessage(ConcurrentWebSocketSessionDecorator.java:104)
 at org.springframework.web.socket.messaging.StompSubProtocolHandler.sendToClient(StompSubProtocolHandler.java:437)
 at org.springframework.web.socket.messaging.StompSubProtocolHandler.handleMessageToClient(StompSubProtocolHandler.java:422)
 at org.springframework.web.socket.messaging.SubProtocolWebSocketHandler.handleMessage(SubProtocolWebSocketHandler.java:338)
 at org.springframework.messaging.support.ExecutorSubscribableChannel$SendTask.run(ExecutorSubscribableChannel.java:135)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)

   Locked ownable synchronizers:
 - <0x00000003c900ae40> (a java.util.concurrent.ThreadPoolExecutor$Worker)
 - <0x00000003ca6933f8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
joakime commented 7 years ago

@ruslan-reshetnikov that's useful, thank you. What version of Jetty are you using?

ruslan-reshetnikov commented 7 years ago

@joakime 9.3.11.v20160721

rototor commented 7 years ago

I've got the same issue with 9.4.3.v20170317. Always at the same place and always when closing a connection which might be already closed.

public static abstract class ChatWebSocketAdapterBase extends WebSocketAdapter implements ChatMessageTransport {
    public void close() {
        try {
            if (isNotConnected())
                return;
            getRemote().flush();
            getSession().close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Is this a usage bug? close() is called when the user ends their chat session. The websocket might already have been closed by the client browser. And I've got an nginx in front of Jetty.

Http Server Thread Pool-11227
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:210)
org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:90)
org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.flush(WebSocketRemoteEndpoint.java:458)
de.typografica.shop.web.frontend.chat.FrontendChatServlet$ChatWebSocketAdapterBase.close(FrontendChatServlet.java:88)
rototor commented 7 years ago

My users have just started to use the chat feature more, and I already have 4 threads hanging on Blocker.block(), so I can reproduce this problem in production. As a short-term workaround I will increase my overall thread pool size, but a real fix would be better, as the thread pool will likely fill up over time. Or do you suggest another workaround for this problem?

rototor commented 7 years ago

I did a bit more testing. If I remove the getRemote().flush() call, the problem disappears. So I think the problem happens when you flush() (or maybe even send???()) to a socket that was closed on the other side (which is a web page in Chrome in my case). The adjusted close() is sketched below.
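
A sketch of that change, under the assumption that simply dropping the synchronous getRemote().flush() from the earlier close() is acceptable; any buffered frames are left to the close handshake instead of being flushed while the peer may already be gone:

public void close() {
    try {
        if (isNotConnected())
            return;
        // No explicit flush(): avoid blocking on a peer that may have vanished.
        getSession().close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}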

bdarwin commented 7 years ago

We are also facing this issue on a regular basis in production. The problem is exactly as described by @ruslan-reshetnikov.

joakime commented 7 years ago

@rototor, @bdarwin do you have idle timeouts defined?

The Blocker.block() will break out when the Endpoint has an idle timeout event.

We have many clients that do not experience these issues, and attempts at replicating have yet to succeed.

bdarwin commented 7 years ago

Yes @joakime, we have an idle timeout on the HttpConfiguration of about 30 minutes or so. But as per our last observation the thread was blocked for more than 8 hours. Yes, it is not easy to reproduce. We have 5 servers and it happens only on 1 server. We followed @jonhill1977's suggestion and deployed it. Hopefully things will get better.

askaribragimov commented 7 years ago

It is very easy to reproduce. I do not have special code at hand, but here is the main principle. You will need Spring, e.g. 4.3.3. Something can perhaps be harvested from here: https://github.com/nickebbutt/stomp-websockets-java-client

In the Java client, find the handleFrame method ( https://github.com/nickebbutt/stomp-websockets-java-client/blob/master/stomp-java-client/src/main/java/com/od/helloclient/HelloClient.java ), add a long Thread.sleep to it, and you will get a hung Jetty thread; a minimal sketch of the idea follows.
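
A minimal sketch of that reproduction principle using Spring's STOMP client, assuming a server endpoint at ws://localhost:8080/ws that broadcasts frequently to /topic/test (both names are illustrative). The client deliberately stalls in handleFrame(), so it stops draining the socket while the server keeps writing:

import org.springframework.messaging.converter.StringMessageConverter;
import org.springframework.messaging.simp.stomp.StompHeaders;
import org.springframework.messaging.simp.stomp.StompSession;
import org.springframework.messaging.simp.stomp.StompSessionHandlerAdapter;
import org.springframework.web.socket.client.standard.StandardWebSocketClient;
import org.springframework.web.socket.messaging.WebSocketStompClient;

public class SlowStompClient extends StompSessionHandlerAdapter {

    @Override
    public void afterConnected(StompSession session, StompHeaders headers) {
        session.subscribe("/topic/test", this);
    }

    @Override
    public void handleFrame(StompHeaders headers, Object payload) {
        try {
            // Consume each frame very slowly instead of returning promptly.
            Thread.sleep(10_000L);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        WebSocketStompClient client = new WebSocketStompClient(new StandardWebSocketClient());
        client.setMessageConverter(new StringMessageConverter());
        client.connect("ws://localhost:8080/ws", new SlowStompClient());
        Thread.sleep(Long.MAX_VALUE); // keep the client JVM alive while the server floods the topic
    }
}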

rototor commented 7 years ago

@joakime I have an idle timeout of one hour on the ServerConnector, but I don't think that's used here. BlockingWriteCallback does not override getIdleTimeout() from the base class SharedBlockingCallback, so it's -1. Or should the blocker get some async notification (e.g. Thread.interrupt())?

I tried to build a small reproduction demo, but I could only get some strange exceptions with many clients, and no hanging connections. E.g. in WebSocketRemoteEndpoint.lockMsg this exception

                    if ((state & BLOCK_MASK) != 0)
                        throw new IllegalStateException(String.format("Blocking message pending %x for %s", state, type));

was thrown many times. No idea if this is a bug or expected behavior. I can provide the demo if you want to investigate.

joakime commented 7 years ago

@rototor a demo that reproduces this specific issue (WebSocket hangs in blockingWrite) would be ideal, yes.

spjoe commented 7 years ago

I ran into the same problem.

Just had a quick look at it, but I think I have already found one race condition: https://github.com/eclipse/jetty.project/blob/jetty-9.3.x/jetty-websocket/websocket-common/src/main/java/org/eclipse/jetty/websocket/common/io/FrameFlusher.java

If thread T1 executes lines 337 to 347 and then switches to thread T2, which is before line 314 and runs until line 332, and then T1 continues until line 381, be aware that flusher.iterate() is called in state FAILED! So nothing will be processed anymore, and the last FrameEntry in the queue will stay there forever.

One callback will never be completed, with neither success nor failure; a simplified model of this interleaving is sketched below.
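
A deliberately simplified model of that interleaving, written only to illustrate the check-then-act pattern described above; this is not the actual FrameFlusher code, and Runnable here stands in for "complete the callback":

import java.util.ArrayDeque;
import java.util.Queue;

class SimplifiedFlusher {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    private boolean failed;

    // T2: the enqueue path checks the failed flag, but the check and the add
    // are not atomic with respect to onFailure().
    void enqueue(Runnable callback) {
        synchronized (this) {
            if (failed) {
                callback.run(); // fail the callback immediately
                return;
            }
        }
        // <-- preemption window: T1 can run onFailure() here
        synchronized (this) {
            queue.add(callback);
        }
        iterate();
    }

    // T1: the failure path drains whatever is queued at this moment.
    void onFailure() {
        synchronized (this) {
            failed = true;
            while (!queue.isEmpty())
                queue.poll().run();
        }
    }

    void iterate() {
        synchronized (this) {
            if (failed)
                return; // no-op once failed: an entry added after the drain is stranded forever
            // normal flushing of queued entries would happen here
        }
    }
}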

dreis2211 commented 6 years ago

@joakime Any thoughts on @spjoe's comment? We're running into the same problem described by @ruslan-reshetnikov with 9.4.7

joakime commented 6 years ago

@dreis2211 the IteratingCallback mechanism that FrameFlusher uses is used in many places in Jetty; the problem described by @spjoe would manifest in regular HTTP with Servlet 2.5 API usage as well (especially so with HTTP/2 usage).

This problem still does not have a reproduction test case. Many different scenarios have been attempted to reproduce this: large through small-scale configurations, on various platforms, using many different varieties of behavior, and still no reproduction case. Hundreds of billions of websocket messages have been sent/received successfully with the various test cases of our own and of various other projects / large users of the websocket library in Jetty.

There's nothing to fix yet, as the cause cannot even be explained.

spjoe commented 6 years ago

@dreis2211 my recommendation is to switch to Netty; they have a well-defined (maybe Jetty also has one, but I was not able to find good documentation about it) and simpler threading model, which IMHO leads to fewer bugs. This is what I have done and I do not regret it :-). Sorry for advertising a rival product.

@joakime HTTP/2 uses code from websocket-common, are you sure? I showed one race condition, which does not mean there are not more, but I do not understand how you can conclude that the cause cannot even be explained. Also, in general that is quite a bold statement :-)

My suggestion for reproduction is to write a test with the threading behavior I described; look into DeterministicScheduler from jMock (maybe you have to refactor a bit to really execute just the parts you want). With this tool I was able to find race conditions in other libraries as well. I most likely will not do it myself, because I currently do not use Jetty.

joakime commented 6 years ago

IteratingCallback is a core class, used for handling callback logic in many places in Jetty. Its use in WebSocket's FrameFlusher is only 1 such place. There are many other uses of IteratingCallback within Jetty, such as the standard Servlet spec 2.5 blocking behaviors and HTTP/2. The reported suspects for this issue would also manifest in those other places, which it doesn't.

@spjoe no worries about advertising other open source projects. :-) The netty project and eclipse jetty project are quite friendly to each other. Heck, netty even uses our alpn-boot layers for HTTP/2 support.

joakime commented 6 years ago

Links to suggestions from @spjoe

joakime commented 6 years ago

@dreis2211 since you stated you are using 9.4.7 ... Do you have an up to date threaddump showing this issue on that version of Jetty?

sbordet commented 6 years ago

@joakime the race reported by @spjoe is legit and should be fixed. It has nothing to do with IteratingCallback.

HTTP2Flusher is implemented differently, as it atomically checks whether it's closed/failed before accepting an entry in the queue (the pattern is sketched below).

I think we should fix it regardless of the WebSocket work in Jetty 10. I can give it a shot, if you want me to.
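
A sketch of that atomic-check pattern, again as a simplified illustration rather than the actual HTTP2Flusher or FrameFlusher code: the failed-check and the enqueue happen under the same lock, so an entry is either accepted before the failure (and then drained by it) or rejected outright, and its callback is always completed.

import java.util.ArrayDeque;
import java.util.Queue;

class AtomicCheckFlusher {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    private Throwable failure;

    boolean enqueue(Runnable callback) {
        Throwable cause;
        synchronized (this) {
            cause = failure;
            if (cause == null)
                queue.add(callback); // accepted only while not failed
        }
        if (cause != null) {
            callback.run(); // fail the callback right away instead of stranding it
            return false;
        }
        iterate();
        return true;
    }

    void fail(Throwable cause) {
        Queue<Runnable> drained;
        synchronized (this) {
            failure = cause;
            drained = new ArrayDeque<>(queue);
            queue.clear();
        }
        drained.forEach(Runnable::run); // complete every already-queued callback with the failure
    }

    private void iterate() {
        // normal flushing of queued entries would happen here
    }
}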

dreis2211 commented 6 years ago

@joakime I'm desperately trying to reproduce it again. Unfortunately, I didn't keep the dump and found this issue only afterwards. At least for us it was (probably) a "one-in-a-million" bug that never happened (or was never reported) before, so it's not that critical for us. It's just unfortunate that only restarting helped.

joakime commented 6 years ago

@sbordet obviously, go for it. heh.

The FrameFlusher in Jetty 10 is the same as in Jetty 9.3.x+, so it would need love there too.

dreis2211 commented 6 years ago

@joakime Just had it again.

I can provide the full thread dump if you want, but basically this is the part that is relevant to this issue:

"clientOutboundChannel-221" #26291 prio=5 os_prio=0 tid=0x00007f0de0070800 nid=0x4be0 waiting on condition [0x00007f0d739c7000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000cac6b8a0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:210)
    at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:90)
    at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
    at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:393)
    at org.springframework.web.socket.adapter.jetty.JettyWebSocketSession.sendTextMessage(JettyWebSocketSession.java:273)
    at org.springframework.web.socket.adapter.AbstractWebSocketSession.sendMessage(AbstractWebSocketSession.java:101)
    at org.springframework.web.socket.handler.ConcurrentWebSocketSessionDecorator.tryFlushMessageBuffer(ConcurrentWebSocketSessionDecorator.java:132)
    at org.springframework.web.socket.handler.ConcurrentWebSocketSessionDecorator.sendMessage(ConcurrentWebSocketSessionDecorator.java:104)
    at org.springframework.web.socket.messaging.StompSubProtocolHandler.sendToClient(StompSubProtocolHandler.java:470)
    at org.springframework.web.socket.messaging.StompSubProtocolHandler.handleMessageToClient(StompSubProtocolHandler.java:457)
    at org.springframework.web.socket.messaging.SubProtocolWebSocketHandler.handleMessage(SubProtocolWebSocketHandler.java:338)
    at org.springframework.messaging.support.ExecutorSubscribableChannel$SendTask.run(ExecutorSubscribableChannel.java:135)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

It's essentially the one that @ruslan-reshetnikov described already, but with 9.4.7.

Cheers, Christoph

joakime commented 6 years ago

Spring + STOMP + custom Spring executors + sessions that have been decorated via the DecoratedObjectFactory. Pretty complex setup you've got there.

ghost commented 6 years ago

I suspect that Christoph has a pretty much default Spring Stomp stack at work.

I use the same, I saw the same when I was still able to reproduce it.

dreis2211 commented 6 years ago

@joakime & @onkami Indeed, quite a common setup with Spring-Boot.

joakime commented 6 years ago

Spring-Boot, good to know. Are there any open source projects with this kind of setup already? It would be good to have a real-world-ish test case for it (to make sure we didn't break an API that Spring expects).

ghost commented 6 years ago

https://spring.io/guides/gs/messaging-stomp-websocket/

They use a JavaScript client, but there is also a demo Java client (also Jetty-based) in Spring itself.

dreis2211 commented 6 years ago

@joakime Regarding the API expectations I found an issue in the Spring-Framework core https://jira.spring.io/browse/SPR-14757 that references this issue and has one particular comment.

Looking at ConcurrentWebSocketSessionDecorator we probably should add an isOpen() check in the loop that sends messages but I'd expect the WebSocket runtime, Jetty it seems, to throw an exception if we try to send after the session is closed.
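
An application-level sketch of the guard that comment suggests, assuming Spring's WebSocketSession API; it narrows the window but cannot eliminate it, because the peer can still disconnect between the check and the write:

import java.io.IOException;

import org.springframework.web.socket.TextMessage;
import org.springframework.web.socket.WebSocketSession;

public final class GuardedSender {
    // Skip the write when the session is already known to be closed.
    public static void sendIfOpen(WebSocketSession session, String payload) throws IOException {
        if (session.isOpen()) {
            session.sendMessage(new TextMessage(payload));
        }
    }
}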

Any thoughts on this?

Thanks by the way for helping!

ghost commented 6 years ago

Note that Spring can use either Tomcat or Jetty; you have to do some work in your Maven/Gradle build file in order to switch to Jetty, but it is straightforward enough and easily googleable: http://therealdanvega.com/blog/2017/05/10/spring-boot-moving-tomcat-jetty

ghost commented 6 years ago

@dreis2211 you found my ticket yay

sbordet commented 6 years ago

@joakime, the CometD build against commit 4389da9 above passes cleanly.

joakime commented 6 years ago

The issue-272-ws-blockingwrite branch has been merged back to jetty-9.4.x

dreis2211 commented 6 years ago

Hey @joakime.

Is there an ETA for when 9.4.8 could be released as a non-snapshot version?

sbordet commented 6 years ago

@dreis2211 9.4.8 is close, but we need to figure out a performance/livelock issue first. We don't have a precise date yet though.

dreis2211 commented 6 years ago

@sbordet I appreciate the feedback nonetheless. Good luck with solving the livelock issue.