Closed dreis2211 closed 5 years ago
@joakime so I am using the MessageWriter to pass to gson to stream the output of messages.
This is a vanilla jetty set up with no proxies, just embedded jetty. All the timeouts on the server have a reasonable limit as you suggest (unless there are some other magic timeouts).
I see a few comments on the getIdleTimeout on the BlockingWriteCallback. What was the issue with making this configurable, just for my understanding?
> All the timeouts on the server have a reasonable limit as you suggest
This is often the key, we have folks that think they've set it, only to see that something else in their code / configuration / annotations has unset it. Try setting the idle timeout to something really low, using your techniques, and see if the idle timeout triggers. If it does, then you are setting the idle timeout properly. (feel free to reset the idle timeout to your original values now).
If idle timeout does not trigger, you have idle timeout set somewhere else.
> I see a few comments on the getIdleTimeout on the BlockingWriteCallback. What was the issue with making this configurable, just for my understanding?
This was configurable on the HttpConfiguration object up until Issue #2525 deprecated / removed the entire concept (present in release Jetty 9.4.11.v20180605 onwards).
That concept, the idle timeout configuration on the SharedBlockingCallback, was an ancient safety holdover from the transition from Jetty 8 (blocking IO) to Jetty 9 (fully async NIO). It was barely used on the Servlet side, and was only provided as an overall timeout for all blocking operations, not for a specific one (a confusing concept to say the least, which is why it was so infrequently used). So if you had a blocking idle timeout of, say, 5,000ms and 3 blocking writes, write(1) [took 1,000ms], write(2) [took 3,000ms], and write(3) [took 3,000ms], then the blocking idle timeout would trigger on write(3), even though write(3) took less than the configured 5,000ms, because the overall blocking time had exceeded 5,000ms.
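The arithmetic in that example can be sketched in plain Java. This is only an illustration of a cumulative budget versus a per-write limit; the class and method names are hypothetical and nothing here is Jetty code.

```java
import java.util.OptionalInt;

/** Hypothetical illustration of the arithmetic above; not Jetty code. */
class CumulativeBlockingTimeoutDemo {

    /** 1-based index of the first write at which the summed durations exceed the budget. */
    static OptionalInt firstWriteOverCumulativeBudget(long budgetMs, long[] writeDurationsMs) {
        long elapsedMs = 0;
        for (int i = 0; i < writeDurationsMs.length; i++) {
            elapsedMs += writeDurationsMs[i];
            if (elapsedMs > budgetMs) {
                return OptionalInt.of(i + 1);
            }
        }
        return OptionalInt.empty();
    }

    /** 1-based index of the first write that on its own exceeds the limit. */
    static OptionalInt firstWriteOverPerWriteLimit(long limitMs, long[] writeDurationsMs) {
        for (int i = 0; i < writeDurationsMs.length; i++) {
            if (writeDurationsMs[i] > limitMs) {
                return OptionalInt.of(i + 1);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        long[] writes = {1_000, 3_000, 3_000};
        // Cumulative: 1,000 + 3,000 + 3,000 = 7,000 > 5,000, so write 3 trips it.
        System.out.println(firstWriteOverCumulativeBudget(5_000, writes));
        // Per write: no single write exceeds 5,000, so nothing trips.
        System.out.println(firstWriteOverPerWriteLimit(5_000, writes));
    }
}
```

With the numbers from the comment, the cumulative check reports write 3 while the per-write check reports nothing, which is the confusing behaviour described.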
> I am using the MessageWriter to pass to gson to stream the output of messages.
Using streaming to read and/or write will fork a new Thread for every message. You have now increased your Threading requirements significantly. When you encounter this issue again, check your Server.threadPool, perform a jetty server dump, and make sure you haven't run out of threads (one of the possible scenarios listed above).
No it does not, it just streams on the current thread that is on. It is just passing a MessageWriter to gson's toJson method to avoid having to build the entire json string in memory that is all.
I set the idle timeout to 10 seconds and it successfully times out.
We can see the same thread leakage in our production system (9.4.12). But I've currently no clue how to reproduce this. Potentially we have a lot of clients or network connections, which will not be closed gracefully from client side. (Spring Boot 1.5.16, default config, InstrumentedQueuedThreadPool from metrics lib as container thread pool and InstrumentedHandler from metrics lib as handler-wrapper)
"clientOutboundChannel-687" #3348 prio=5 os_prio=0 tid=0x00007fd590390000 nid=0x2d8a waiting on condition [0x00007fd5511b6000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000008dc71ea0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:219)
at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:90)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:394)
at org.springframework.web.socket.adapter.jetty.JettyWebSocketSession.sendTextMessage(JettyWebSocketSession.java:273)
at org.springframework.web.socket.adapter.AbstractWebSocketSession.sendMessage(AbstractWebSocketSession.java:101)
at org.springframework.web.socket.sockjs.transport.session.WebSocketServerSockJsSession.writeFrameInternal(WebSocketServerSockJsSession.java:222)
at org.springframework.web.socket.sockjs.transport.session.AbstractSockJsSession.writeFrame(AbstractSockJsSession.java:318)
at org.springframework.web.socket.sockjs.transport.session.WebSocketServerSockJsSession.sendMessageInternal(WebSocketServerSockJsSession.java:212)
at org.springframework.web.socket.sockjs.transport.session.AbstractSockJsSession.sendMessage(AbstractSockJsSession.java:162)
at org.springframework.web.socket.handler.ConcurrentWebSocketSessionDecorator.tryFlushMessageBuffer(ConcurrentWebSocketSessionDecorator.java:153)
at org.springframework.web.socket.handler.ConcurrentWebSocketSessionDecorator.sendMessage(ConcurrentWebSocketSessionDecorator.java:126)
at org.springframework.web.socket.messaging.StompSubProtocolHandler.sendToClient(StompSubProtocolHandler.java:471)
at org.springframework.web.socket.messaging.StompSubProtocolHandler.handleMessageToClient(StompSubProtocolHandler.java:458)
at org.springframework.web.socket.messaging.SubProtocolWebSocketHandler.handleMessage(SubProtocolWebSocketHandler.java:340)
at org.springframework.messaging.support.ExecutorSubscribableChannel$SendTask.run(ExecutorSubscribableChannel.java:135)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- <0x000000008dc22390> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
- <0x000000008dc71b18> (a java.util.concurrent.ThreadPoolExecutor$Worker)
I will check the idle timeout config (use default values from spring boot probably)
We are using default jetty timeouts: QueuedThreadPool: 60s WebSocketPolicy: 300s
The workaround by overriding BlockingWriteCallback class with timeout set to 60s seems to resolve our problems. I didn't read the complete discussion, but fault tolerance and resilient software design patterns should be implemented on both sides (client and server) to prevent such use cases.
@gbrehmer unfortunately your solution causes resource leaks and hampers the GC for others. So it's not a viable solution. Your solution breaks the processing at the wrong place, not allowing network resources to clean up properly. That's one example (of many) why we recommend using proper network Idle timeouts.
Thanks for your response. So you think, that in my case it's not possible that jetty's default websocket idle timeout is used? Something in my application must overwrite the default 300s? Or are there any other places in the spring boot application/embedded jetty, where the configuration of an idle timeout is missing?
In every test and example codebase we've worked (from our own code and example codebases presented by others) surrounding this issue the network idle timeouts are either not set, or set to something excessively large (eg: 30 days).
Once the network idle timeout is set, the BlockingWriteCallback is broken properly as a result of the network layer being closed and properly cleaned up. This will also properly cleanup (Close handshake) the WebSocket protocol behaviors as well.
Aborting the write at the BlockingWriteCallback aborts all of the code managing the networking layers, no cleanup of websocket (in fact the WebSocket state and protocol are now totally invalid if you do this), and no cleanup of the underlying network connection (nio, selectors, open file handles, etc).
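To make the difference concrete, here is a minimal sketch using only `java.util.concurrent`. It assumes a stand-in `FakeConnection` class (hypothetical, not a Jetty type) whose pending write never completes on its own. The blocked writer has no timeout of its own; it is released only because the simulated network layer fails the pending write when its idle timeout expires, which is the order of events described above.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; FakeConnection is a stand-in, not a Jetty class. */
class IdleTimeoutReleasesBlockedWriteDemo {

    static final class FakeConnection {
        // Represents a write the remote peer never acknowledges.
        private final CompletableFuture<Void> pendingWrite = new CompletableFuture<>();

        /** Blocks the caller until the write succeeds or is failed from elsewhere. */
        void blockingWrite() throws IOException {
            try {
                pendingWrite.get(); // deliberately untimed, like an await() without timeout
            } catch (ExecutionException e) {
                throw new IOException("write failed", e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted", e);
            }
        }

        /** Invoked by the simulated network layer once the idle timeout expires. */
        void onIdleTimeout() {
            // A real network layer would release its own resources first,
            // then fail whatever is still pending on the connection.
            pendingWrite.completeExceptionally(new IOException("idle timeout"));
        }
    }

    static String run(long idleTimeoutMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        FakeConnection connection = new FakeConnection();
        scheduler.schedule(connection::onIdleTimeout, idleTimeoutMs, TimeUnit.MILLISECONDS);
        try {
            connection.blockingWrite();
            return "completed";
        } catch (IOException e) {
            return "released: " + e.getCause().getMessage();
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(200));
    }
}
```

The writer thread wakes up with an exception after roughly the configured idle timeout, and the cleanup is driven from the connection side rather than from the blocked caller.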
There is only 1 scenario where things get wonky from a client perspective: a client connects through nginx to Jetty, and Jetty's idle timeout fires before nginx's. nginx doesn't see this connection close and will wait for its own configured idle timeout before closing the connection to the client.
Okay, if I understand you correctly, all examples that reproduce the issue contain a timeout misconfiguration. I can only talk about our issues and configuration:
I will check the HAProxy config, but is it possible that a proxy misconfiguration can lead to a thread leak inside the embedded Jetty container?
We have improved the HAProxy timeout configuration (timeout tunnel 5m, timeout client-fin 1m), but we still see Thread leakage. Would it be helpful to generate a jetty detailed dump?
Is it possible that the Spring SockJS heartbeat thread/task is circumventing the idle timeout logic? I saw that Spring Boot 2/Spring Framework 5 uses an async write method with a callback for sending websocket messages. That can probably fix our problems, but a Spring upgrade is a bigger task ;)
Thread leak problem persists with Spring 5.1.3. I guess I didn't compare the correct class, because with the stomp protocol implementation the messages are still sent synchronously.
I am seeing WebSocket hangs too. They started after migrating from 9.3.8.v20160314 to 9.4.14.v20181114 (javax-websocket-server-impl, jetty-util, jetty-webapp).
@exploder86
I think we have the same issue.
But I guess the root cause was that the number of WebSocket threads doubled and exceeded the thread limit (the default is 200; even when I set it to 400, it still reaches the limit).
@fred521 that doesn't sound related.
If you have excessive thread usage consider one of the following ..
For those of you that can trigger this issue easily, please try the following snapshot release 9.4.15-20190205.170525-51
...
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-home</artifactId>
<version>9.4.15-20190205.170525-51</version>
</dependency>
Found on the official Jetty SNAPSHOT repository at https://oss.sonatype.org/content/repositories/jetty-snapshots/
@joakime thanks for the update! We will try the version next week in our production system. Knowing whether it works correctly may take some time, because the error often occurs only 1-2 times per week and we have to check the state before redeployment, otherwise the error state is cleared.
@joakime Can the latest release tag 9.4.15 be pushed to the sonatype repos? because spring uses only some of the jetty modules (not jetty-home at all) and each module has another snapshot version
@gbrehmer staged release version jetty-9.4.15.v20190215 (currently undergoing testing) is available at https://oss.sonatype.org/content/repositories/jetty-1451/
@gbrehmer Jetty 9.4.15.v20190215 is now available on maven central.
We installed the version this morning. So far no more leaks. I'll give an update after one week in production.
@joakime: I've updated from 9.4.12 to 9.4.15 and got an even worse result.
Beforehand, I made sure that the fix for #3279 also applied to my system: I found Flushers with queueSize > 0 and terminated != null, which is why I tried the new Jetty version.
But with the update the sockets locked up even faster at:
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <568d644c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:219)
at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:90)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:385)
at org.springframework.web.socket.adapter.jetty.JettyWebSocketSession.sendTextMessage(JettyWebSocketSession.java:273)
at org.springframework.web.socket.adapter.AbstractWebSocketSession.sendMessage(AbstractWebSocketSession.java:101)
Instead of days it just took several minutes.
from our prod system: no more thread leaks with 9.4.15
Mh, maybe there are additional issues that result in such blockings and I'm affected by the fixed one and some others and you were only affected by the fixed one. Otherwise I don't have a better explanation for this.
@GFriedrich can you please open a new issue with the stack trace and a server dump?
@sbordet: Unfortunately I can't provide you with a server dump for various reasons 😞 - I know that this is not helpful. I just wanted to make you aware, that there is still some place hidden, where websocket threads get into a locking state.
@GFriedrich, @sbordet asked for a server dump, not a heap dump. https://www.eclipse.org/jetty/documentation/current/jetty-dump-tool.html
@joakime: Yes that's why I changed my sentence just a moment after I've posted it. But it doesn't change the conclusion. Sorry again. :worried:
@GFriedrich there is not much in a server dump that needs to be hidden: most of it are Jetty classes. Would you be able to redact the things you don't want to show? The rest may still be useful to us.
It's not so much about hiding things, it's about how I get them. I really would like to help you, but there is not much I can do from my place.
Bad news: we also see new thread leaks, 3 since Friday. I have to analyze the memory dump, but it seems that the state has changed (the Flusher's queue is now always empty).
@gbrehmer interesting. keep us in the loop. Even if all you can provide is a threaddump that would at least help us point to a different place.
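For anyone who cannot run jstack against the affected process, a filtered thread dump can also be taken from inside the JVM with the standard `ThreadMXBean`. The sketch below is an assumption about how one might do that; `ParkedThreadFinder` is a hypothetical helper, and the class-name prefix is simply taken from the stack traces posted in this thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; uses only the standard java.lang.management API. */
class ParkedThreadFinder {

    /** Names of threads in WAITING state with at least one frame in a matching class. */
    static List<String> waitingThreadsIn(String classNamePrefix) {
        List<String> names = new ArrayList<>();
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info == null || info.getThreadState() != Thread.State.WAITING) {
                continue;
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getClassName().startsWith(classNamePrefix)) {
                    names.add(info.getThreadName());
                    break; // one matching frame is enough for this thread
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Prefix taken from the traces in this issue; adjust as needed.
        List<String> stuck = waitingThreadsIn("org.eclipse.jetty.util.SharedBlockingCallback");
        System.out.println("Threads parked in SharedBlockingCallback: " + stuck);
    }
}
```

A single snapshot cannot tell a leaked thread from one that is only briefly blocked, so comparing two snapshots taken some minutes apart is more telling.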
@gbrehmer any update? have any new thread dumps for us?
The threaddump looks the same:
attached connection:
attached flusher (queue is empty/null values in heap dump):
state:
@gbrehmer can you please take a Server dump when it happens?
Sure, but getting more details can take one or two weeks, because, as in the first two weeks with 9.4.15, it may be that no thread leak occurs. We can activate the feature on Monday at the earliest (I forgot to set up the server instance too; I only added the flag on the QueuedThreadPool). The same information should be in the heap dump, though it's not so easy to extract, I guess.
EDIT: perhaps you could give me an OQL query string to extract the information with MAT.
@sbordet I configured Jetty to dump info to the logfile on shutdown, but systemd kills the service during the service restart procedure, so no Jetty dump is available. I assume that the thread leak is the cause of the graceful shutdown timeout (systemd: "State 'stop-final-sigterm' timed out. Killing."). We probably have to configure JMX, but this will take much more time. So I ask again: why is a full heap dump not enough?
@gbrehmer a full heap dump is enough. Please attach it here.
ah I can not attach the full dump, because it's from our prod system with customer data in it. I can execute OQL Queries on this with mat and post the result here or I can take screenshots from views like the ones above
Unfortunately, we don't know where the problem is in your particular case, so I would not know what OQL to ask you. That is why we asked for a Server dump or for the full heap dump: we would have the whole picture and we would be able to look at all the things.
Here is the requested server dump: https://pastebin.com/kjR4XbBa
@gbrehmer thank you for the dump, but we're a bit more confused now. That dump has no websocket endpoints present. None registered, none added. It does have websocket support present (as evidenced by the WebSocketUpgradeFilter), but nothing is using it (that filter has 0 mappings present).
You are right. I didn't check the log before posting. I created an HTTP endpoint to trigger the dump creation, but probably injected the wrong webserver instance and didn't know that Spring Boot creates multiple instances. I have to go deeper into the Spring Boot code. It is possible that the missing parts are produced by https://github.com/spring-projects/spring-framework/blob/master/spring-websocket/src/main/java/org/springframework/web/socket/server/jetty/JettyRequestUpgradeStrategy.java
It seems that the websocket part is created on the fly without a direct mapping to the Jetty server instance, like using the Jetty websocket support in an embedded way as part of the Spring WebMVC controller logic.
@joakime @sbordet do you have any contacts to the Spring developers? Should I raise a bug there because using WebSocketHandler manually is not supported by the jetty team? Last time they closed the bug on spring side because this bug exists
Hi guys,
we've just encountered this issue with the latest 9.4.18.v20190429 version.
Our scenario: We have a server component having a server socket open for UI clients and additionally the server component has downstream connections to other server components using Jetty WebSocket Client. In the test scenario the network cable is unplugged and afterwards our system gets blocked. We do not use any spring library, just vanilla Jetty webSocket classes.
Our observations after unplugging the network cable (and plugging it in later again):
At some point in time an exception in our sender thread [Worker-0] is thrown: WARN [Worker-0] o.e.j.w.c.e.c.CompressExtension:
java.lang.NullPointerException: Deflater has been closed
at java.util.zip.Deflater.ensureOpen(Unknown Source) ~[?:1.8.0_131]
at java.util.zip.Deflater.deflate(Unknown Source) ~[?:1.8.0_131]
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.compress(CompressExtension.java:488) ~[websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.deflate(CompressExtension.java:451) ~[websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension$Flusher.process(CompressExtension.java:431) [websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241) [jetty-util-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:224) [jetty-util-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.extensions.compress.CompressExtension.outgoingFrame(CompressExtension.java:218) [websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.extensions.ExtensionStack$Flusher.process(ExtensionStack.java:400) [websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241) [jetty-util-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:224) [jetty-util-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.outgoingFrame(ExtensionStack.java:277) [websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.uncheckedSendFrame(WebSocketRemoteEndpoint.java:307) [websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:106) [websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:385) [websocket-common-9.4.18.v20190429.jar:9.4.18.v20190429]
at company.messaging.ws.WSSendTask.sendJsonViaSession(WSSendTask.java:62) [core-runtime-7.0.3.jar:7.0.3]
at company.messaging.ws.WSSendTask.sendResponseOrNotification(WSSendTask.java:53) [core-runtime-7.0.3.jar:7.0.3]
at company.messaging.ws.WSSendTask.run(WSSendTask.java:47) [core-runtime-7.0.3.jar:7.0.3]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:1.8.0_131]
at java.util.concurrent.FutureTask.run(Unknown Source) [?:1.8.0_131]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) [?:1.8.0_131]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.8.0_131]
at java.lang.Thread.run(Unknown Source) [?:1.8.0_131]
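For readers unfamiliar with the exception at the top of that trace: `java.util.zip.Deflater` raises it when `deflate` is called after `end()` has released the native compressor. The following minimal sketch reproduces only that JDK behaviour in isolation; it makes no claim about why the deflater was closed in our scenario, and `ClosedDeflaterDemo` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

/** Hypothetical demo of the JDK behaviour only; unrelated to Jetty internals. */
class ClosedDeflaterDemo {

    /** Calls deflate() on a Deflater that has already been ended and reports what happens. */
    static String deflateAfterEnd() {
        Deflater deflater = new Deflater();
        deflater.setInput("hello".getBytes(StandardCharsets.UTF_8));
        deflater.end(); // releases the native zlib state; the instance is unusable afterwards
        try {
            deflater.deflate(new byte[64]);
            return "no exception";
        } catch (NullPointerException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(deflateAfterEnd());
    }
}
```

So the trace indicates that some code path ended the compressor while the sender thread was still trying to write through the compression extension.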
From that point on the sender thread Worker-0 is in WAITING as our thread dump can confirm and is blocked indefinitely, as a result the system is stuck:
2019-05-29 15:39:25
Full thread dump Java HotSpot(TM) Embedded Client VM (25.131-b11 mixed mode):
"JMX server connection timeout 73" - Thread t@73
java.lang.Thread.State: TIMED_WAITING
at java.lang.Object.wait(Native Method)
- waiting on <177617e> (a [I)
at com.sun.jmx.remote.internal.ServerCommunicatorAdmin$Timeout.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"RMI TCP Connection(22)-10.10.40.125" - Thread t@72
java.lang.Thread.State: RUNNABLE
at sun.management.ThreadImpl.dumpThreads0(Native Method)
at sun.management.ThreadImpl.dumpAllThreads(Unknown Source)
at sun.reflect.GeneratedMethodAccessor117.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.reflect.misc.Trampoline.invoke(Unknown Source)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.reflect.misc.MethodUtil.invoke(Unknown Source)
at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(Unknown Source)
at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(Unknown Source)
at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(Unknown Source)
at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(Unknown Source)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(Unknown Source)
at com.sun.jmx.mbeanserver.PerInterface.invoke(Unknown Source)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(Unknown Source)
at javax.management.StandardMBean.invoke(Unknown Source)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(Unknown Source)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(Unknown Source)
at sun.reflect.GeneratedMethodAccessor114.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at sun.rmi.server.UnicastServerRef.dispatch(Unknown Source)
at sun.rmi.transport.Transport$1.run(Unknown Source)
at sun.rmi.transport.Transport$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$222/26122873.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- locked <77a1ee> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"qtp26395465-67" - Thread t@67
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(Unknown Source)
at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
- locked <1526db5> (a sun.nio.ch.Util$3)
- locked <84a19b> (a java.util.Collections$UnmodifiableSet)
- locked <1afe194> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:464)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:401)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"RMI Scheduler(0)" - Thread t@51
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <181ee17> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"Connector-Scheduler-1b7183b" - Thread t@38
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <64c0db> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"HttpClient@193e4a8-scheduler" - Thread t@35
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <7aae89> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"JmDNS pool-3-thread-1" - Thread t@34
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <a35e36> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
at java.util.concurrent.LinkedBlockingQueue.take(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"JmDNS(dheva01.local.).State.Timer" - Thread t@33
java.lang.Thread.State: TIMED_WAITING
at java.lang.Object.wait(Native Method)
- waiting on <190b9c0> (a java.util.TaskQueue)
at java.util.TimerThread.mainLoop(Unknown Source)
at java.util.TimerThread.run(Unknown Source)
Locked ownable synchronizers:
- None
"JmDNS(dheva01.local.).Timer" - Thread t@32
java.lang.Thread.State: TIMED_WAITING
at java.lang.Object.wait(Native Method)
- waiting on <1975a67> (a java.util.TaskQueue)
at java.util.TimerThread.mainLoop(Unknown Source)
at java.util.TimerThread.run(Unknown Source)
Locked ownable synchronizers:
- None
"SocketListener(dheva01.local.)" - Thread t@31
java.lang.Thread.State: RUNNABLE
at java.net.PlainDatagramSocketImpl.receive0(Native Method)
- locked <1aa7fa> (a java.net.PlainDatagramSocketImpl)
at java.net.AbstractPlainDatagramSocketImpl.receive(Unknown Source)
- locked <1aa7fa> (a java.net.PlainDatagramSocketImpl)
at java.net.DatagramSocket.receive(Unknown Source)
- locked <e21ea1> (a java.net.DatagramPacket)
- locked <32e904> (a java.net.MulticastSocket)
at javax.jmdns.impl.SocketListener.run(SocketListener.java:41)
Locked ownable synchronizers:
- None
"saga-worker" - Thread t@30
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <163ce07> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
at java.util.concurrent.LinkedBlockingQueue.take(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"Session-HouseKeeper-1a69af7" - Thread t@28
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <bf9f26> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"qtp26395465-27-acceptor-0@18e8432-ServerConnector@1b7183b{HTTP/1.1,[http/1.1]}{0.0.0.0:8888}" - Thread t@27
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(Unknown Source)
at sun.nio.ch.ServerSocketChannelImpl.accept(Unknown Source)
- locked <3e8206> (a java.lang.Object)
at org.eclipse.jetty.server.ServerConnector.accept(ServerConnector.java:385)
at org.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:648)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-24" - Thread t@24
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <1db3f8c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-23" - Thread t@23
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <1db3f8c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-22" - Thread t@22
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <1db3f8c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-21" - Thread t@21
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <1db3f8c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-20" - Thread t@20
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <1db3f8c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-19" - Thread t@19
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <1db3f8c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-18" - Thread t@18
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(Unknown Source)
at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
- locked <cbc7c6> (a sun.nio.ch.Util$3)
- locked <953396> (a java.util.Collections$UnmodifiableSet)
- locked <3db1f0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:464)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:401)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"WebSocketClient@27492551-17" - Thread t@17
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <1db3f8c> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.idleJobPoll(QueuedThreadPool.java:858)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:783)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"Worker-0" - Thread t@14
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <2ec724> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:219)
at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:90)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:385)
at company.messaging.ws.WSSendTask.sendJsonViaSession(WSSendTask.java:62)
at company.messaging.ws.WSSendTask.sendResponseOrNotification(WSSendTask.java:53)
at company.messaging.ws.WSSendTask.run(WSSendTask.java:47)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- locked <3793c7> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"RMI TCP Accept-0" - Thread t@13
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(Unknown Source)
at java.net.ServerSocket.implAccept(Unknown Source)
at java.net.ServerSocket.accept(Unknown Source)
at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"RMI TCP Accept-1189" - Thread t@12
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(Unknown Source)
at java.net.ServerSocket.implAccept(Unknown Source)
at java.net.ServerSocket.accept(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"RMI TCP Accept-0" - Thread t@11
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(Unknown Source)
at java.net.ServerSocket.implAccept(Unknown Source)
at java.net.ServerSocket.accept(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(Unknown Source)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- None
"VM JFR Buffer Thread" - Thread t@7
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"JFR request timer" - Thread t@5
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on <6bcd90> (a java.util.TaskQueue)
at java.lang.Object.wait(Unknown Source)
at java.util.TimerThread.mainLoop(Unknown Source)
at java.util.TimerThread.run(Unknown Source)
Locked ownable synchronizers:
- None
"Signal Dispatcher" - Thread t@4
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Finalizer" - Thread t@3
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on <13d5908> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(Unknown Source)
at java.lang.ref.ReferenceQueue.remove(Unknown Source)
at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)
Locked ownable synchronizers:
- None
"Reference Handler" - Thread t@2
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on <12cfae> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Unknown Source)
at java.lang.ref.Reference.tryHandlePending(Unknown Source)
at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)
Locked ownable synchronizers:
- None
"main" - Thread t@1
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on <3739af> (a java.lang.Object)
at java.lang.Object.wait(Unknown Source)
at org.eclipse.jetty.util.thread.QueuedThreadPool.join(QueuedThreadPool.java:498)
at org.eclipse.jetty.server.Server.join(Server.java:558)
at company.ClusterCore.join(ClusterCore.java:98)
at company.ClusterKitt.join(ClusterKitt.java:71)
at company.apps.bootstrap.AppsBootstrapper.main(AppsBootstrapper.java:19)
Locked ownable synchronizers:
- None
Update: Because we were under time pressure to address this issue, we chose to override the getIdleTimeout method in SharedBlockingCallback from the jetty-util artifact. This workaround is working in our case.
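For readers who cannot patch jetty-util, another way to keep a sender thread from parking indefinitely (as "Worker-0" does in the dump above) is to wait on an asynchronous send with an explicit timeout. The following is not the override described above, only a minimal sketch using plain java.util.concurrent. The class name BoundedSend and the method awaitSend are hypothetical. In Jetty 9.4 the Future<Void> could, for example, come from RemoteEndpoint.sendStringByFuture; here it is simulated with CompletableFuture so the example runs on its own.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Jetty: bounds the wait on an async send.
public class BoundedSend {

    /**
     * Waits for a pending send to finish, but never longer than timeoutMillis.
     * Returns true if the send completed normally, false if it timed out,
     * failed, or the waiting thread was interrupted.
     */
    public static boolean awaitSend(Future<Void> pendingSend, long timeoutMillis) {
        try {
            pendingSend.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Best-effort cancel; whether the underlying write is actually
            // aborted depends on the Future implementation.
            pendingSend.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The send itself failed (for example, the connection was closed).
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulated sends: one already completed, one that never completes.
        Future<Void> completed = CompletableFuture.completedFuture(null);
        Future<Void> stuck = new CompletableFuture<>();

        System.out.println("completed send ok: " + awaitSend(completed, 100));
        System.out.println("stuck send ok: " + awaitSend(stuck, 100));
    }
}
```

When awaitSend returns false, the caller would still have to decide what to do with the session (for instance, close it); the sketch only guarantees that the sending thread gets control back.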
I can see the NullPointerException coming from the CompressExtension as well, but I can't (yet) connect it directly to the times when we see blocked threads. Still, this might be a hint at what's going wrong.
Hi,
@joakime suggested that I open a new issue for #272, as it still occurs on 9.4.8, with a slight tendency to occur more often now (which might just be bad luck on our end).
Unfortunately, I can't say anything new about the issue. Threads still seem to end up in the WAITING state at random (regardless of load, for example), and only a server restart resolves the issue.
As this is affecting our production servers, I'd appreciate it if this were investigated again. I'd also appreciate any workaround that doesn't involve a server restart.
Cheers, Christoph