jetty / jetty.project

Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more
https://eclipse.dev/jetty
Other
3.87k stars 1.91k forks source link

100% CPU usage in Selector using Jetty on Windows #2205

Closed joemokos closed 2 years ago

joemokos commented 6 years ago

I am running embedded jetty 9.4.8.v20171121. My app runs fine for a period of time, usually about 2 hours, with very little CPU usage. Then, 2 threads, both named WebSocketContainer@1861866092-, start to consume 100% CPU when idle. A stack trace at this point yields the following:

 Thread id32:WebSocketContainer@1861866092-32     RUNNABLE     55.84375s
 sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
 sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:296)
 sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:278)
 sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:159)
 sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
 sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
 sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
 org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:375)
 org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:304)
 org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:179)
 org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
 org.eclipse.jetty.io.ManagedSelector$$Lambda$23/663060787.run(Unknown Source)
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
 org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
 java.lang.Thread.run(Thread.java:748)

This seems very similar to: https://github.com/eclipse/jetty.project/issues/1446

but I am running a newer version of jetty.

I can only get this problem to occur on servers running Windows 2012R2. I am running java jdk1.8.0_162 although I have also seen the problem with jdk1.8.0_131.

Does anyone have any ideas what could be causing this?

joakime commented 6 years ago

This seems to reference Issue #1446 (using github syntax to trigger reference link)

joemokos commented 6 years ago

Agreed, it does seem to reference #1446. However #1446 says problem seemed to be corrected by jetty 9.4.6.v20170531. I have been running with 9.4.7.v20170914 and recently upgraded to 9.4.8.v20171121 in hopes that in would correct the problem. But it does not. So if the problem was corrected in 9.4.6, it looks like it returned in 9.4.7 and 9.4.8

joemokos commented 6 years ago

I was able to get some debug logging turned on when the problem occurred. It looks like the same pattern is repeated constantly. Here's a snippet. Full logs are also available if you need them:

03/14/2018 18:04:58.782 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 waiting on select 03/14/2018 18:04:58.782 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: Selector sun.nio.ch.WindowsSelectorImpl@38021066 waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: Selector sun.nio.ch.WindowsSelectorImpl@38021066 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: Selector sun.nio.ch.WindowsSelectorImpl@38021066 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: Selector sun.nio.ch.WindowsSelectorImpl@38021066 waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: Selector sun.nio.ch.WindowsSelectorImpl@38021066 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: Selector sun.nio.ch.WindowsSelectorImpl@3ae48017 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: Selector sun.nio.ch.WindowsSelectorImpl@3ae48017 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: Selector sun.nio.ch.WindowsSelectorImpl@3ae48017 waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-53: Selector sun.nio.ch.WindowsSelectorImpl@72f12611 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-53: Selector sun.nio.ch.WindowsSelectorImpl@72f12611 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-53: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-53: Selector sun.nio.ch.WindowsSelectorImpl@72f12611 waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: Selector sun.nio.ch.WindowsSelectorImpl@3ae48017 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: Selector sun.nio.ch.WindowsSelectorImpl@3ae48017 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-32: Selector sun.nio.ch.WindowsSelectorImpl@3ae48017 waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: Selector sun.nio.ch.WindowsSelectorImpl@38021066 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-76: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-36: Selector sun.nio.ch.WindowsSelectorImpl@142a7d96 waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: action=null wakeup=false 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a waiting on select 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a woken up from select, 0/0 selected 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: Selector sun.nio.ch.WindowsSelectorImpl@4193294a processing 0 keys, 0 actions 03/14/2018 18:04:58.829 DEBUG jetty WebSocketContainer@1018635868-31: action=null wakeup=false

joakime commented 6 years ago

If you can build the jetty source, please try the current jetty-9.4.x (branch) HEAD. - commit 44546e5ca93c140e6f710ca08d5091ac499c9397 Or, if you can be a little patient, we are currently staging a potential 9.4.9 release with a possible hack for a recently identified Windows NIO bug.

joemokos commented 6 years ago

Thanks Joakim. I'll try to build the HEAD from source. Do you have an ETA on 9.4.9?

joemokos commented 6 years ago

Is the fix you're referring to for bug #2335?

joakime commented 6 years ago

Yes, PR #2335 is the source of the fix.

The 9.4.9 staged release should be available on oss.sonatype.org in a few hours.

joakime commented 6 years ago

The Jetty 9.4.9 stage is available.

Note: this is not an official release, and only represents a potential release. It will only be official once adequate testing has been performed and then promoted to general release.

https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.9.v20180315 Many many fixes, see above link for list.

Staging Repository: https://oss.sonatype.org/content/repositories/jetty-1370/

joemokos commented 6 years ago

Thanks Joakim.

I’ll give it a try.

Joe

From: Joakim Erdfelt [mailto:notifications@github.com] Sent: Thursday, March 15, 2018 11:29 AM To: eclipse/jetty.project jetty.project@noreply.github.com Cc: Joseph Mokos Joseph.Mokos@riverbed.com; Author author@noreply.github.com Subject: Re: [eclipse/jetty.project] 100% CPU usage in Selector using Jetty on Windows (#2205)

The Jetty 9.4.9 stage is available.

Note: this is not an official release, and only represents a potential release. It will only be official once adequate testing has been performed and then promoted to general release.

https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.9.v20180315 Many many fixes, see above link for list.

Staging Repository: https://oss.sonatype.org/content/repositories/jetty-1370/

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/eclipse/jetty.project/issues/2205#issuecomment-373416904, or mute the threadhttps://github.com/notifications/unsubscribe-auth/Ai-XwA6R3c09pqFoghfzcv1PdCtREU2Yks5teojGgaJpZM4SMWmA.

joemokos commented 6 years ago

Converted over to 9.4.9. Will have to send update to our customer to determine if the 100% CPU problem is fixed. We were never able to recreate in-house. Assuming this fixes the problem, do you have any idea when 9.4.9 will be GA?

joakime commented 6 years ago

Jetty 9.4.9.v20180320 has been released with a hack for the identified Windows JVM Selector NIO Bug.

joemokos commented 6 years ago

Finally got a chance to test 9.4.9 at client site and am still running into the same issue. Following is a snippet from the log that was taken while the 100% CPU was occurring:

03/30/2018 18:15:47.202 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected 03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0 03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: updates 0 03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select 03/30/2018 18:15:47.202 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 woken up from select, 0/0 selected 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 processing 0 keys, 0 updates 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: updateable 0 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: updates 0 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 waiting on select 03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: updates 0 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select 03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-37: updateable 0 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-37: updates 0 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c waiting on select 03/30/2018 18:15:47.203 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 woken up from select, 0/0 selected 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 processing 0 keys, 0 updates 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-41: updateable 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-41: updates 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 waiting on select 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c woken up from select, 0/0 selected 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c processing 0 keys, 0 updates 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: updateable 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: updates 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c waiting on select 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: updates 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select 03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 woken up from select, 0/0 selected 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 processing 0 keys, 0 updates 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: updateable 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: updates 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 waiting on select 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-39: updates 0 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c woken up from select, 0/0 selected 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c processing 0 keys, 0 updates 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-37: updateable 0 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-37: updates 0 03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 woken up from select, 0/0 selected 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 processing 0 keys, 0 updates 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-41: updateable 0 03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-41: updates 0 03/30/2018

joakime commented 6 years ago

Do you have a testcase that can cause this?

joemokos commented 6 years ago

Unfortunately, no. So far, I have not been able to recreate this in-house. I have only seen it at one of our customer's and only on Windows 2012 R2 machines.

joakime commented 6 years ago

What version of Java?

joemokos commented 6 years ago

jdk1.8.0_162 currently but we also saw the problem with jdk1.8.0_131

gregw commented 6 years ago

Here is that log sorted by thread:

03/30/2018 18:15:47.202 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 woken up from select, 0/0 selected
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 processing 0 keys, 0 updates 
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 waiting on select
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 woken up from select, 0/0 selected
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: updateable 0
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-36: updates 0
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 processing 0 keys, 0 updates 
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: Selector sun.nio.ch.WindowsSelectorImpl@d94c4f5 waiting on select
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: updateable 0
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-36: updates 0

03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-37: updateable 0
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c waiting on select
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c woken up from select, 0/0 selected
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-37: updates 0
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c processing 0 keys, 0 updates 
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c waiting on select
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c woken up from select, 0/0 selected
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: updateable 0
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-37: updates 0
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-37: Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c processing 0 keys, 0 updates 
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-37: updateable 0
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-37: updates 0

03/30/2018 18:15:47.202 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected
03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 
03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select
03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected
03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0
03/30/2018 18:15:47.314 DEBUG jetty WebSocketContainer@1444944344-39: updates 0
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-39: updates 0
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd processing 0 keys, 0 updates 
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd woken up from select, 0/0 selected
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-39: updates 0
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-39: Selector sun.nio.ch.WindowsSelectorImpl@a0305fd waiting on select
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-39: updateable 0
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-39: updates 0

03/30/2018 18:15:47.203 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 woken up from select, 0/0 selected
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 processing 0 keys, 0 updates 
03/30/2018 18:15:47.315 DEBUG jetty WebSocketContainer@1444944344-41: updateable 0
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 waiting on select
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 woken up from select, 0/0 selected
03/30/2018 18:15:47.316 DEBUG jetty WebSocketContainer@1444944344-41: updates 0
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-41: Selector sun.nio.ch.WindowsSelectorImpl@3c509ad3 processing 0 keys, 0 updates 
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-41: updateable 0
03/30/2018 18:15:47.317 DEBUG jetty WebSocketContainer@1444944344-41: updates 0
gregw commented 6 years ago

I cannot see how this is not a JVM bug. These are 4 idle selectors with each having no keys at all, so they should all definitely block in select until it is woken up. I guess conceivably we have a bug were we are waking up the selector when there is nothing to do?? If we gave you a version with extra instrumentation could you run that on your clients site to capture more info?

However, I'm really dubious we will be able to fix this in the main line of the project. We already have some code to work around windows bugs and I'll really loath to add more work arounds in such a core part of the code. In this case it looks like if there are zero keys then we'd need to wait on a separate mutex... but then we'd have a race, so we'd have to synchronize and we'd kill performance for the 99.999% of other users. At best I'm seeing a branch with a work around that you'd have to build/update yourself.

joemokos commented 6 years ago

Hi Greg, I can't disagree with anything you're saying. Those selects should block until the socket(s) is readable, are interrupted or are woken via wakeup. Let me check with my client to see if we can run a version with extra instrumentation.

I don't know if this is important or not or just a coincidence, but this only happens on 3 servers (our app is running on many). They are running windows 2012 R2. And it always seems to happen 2 hours after our app starts and the websocket connection is made. This does not happen (or hasn't) on windows 2012 servers.

joemokos commented 6 years ago

Hi Greg, I was looking at the log and I noticed threads 36, 37, 39 and 41 are idle and experiencing this problem. The log message generated by these threads says:

Selector sun.nio.ch.WindowsSelectorImpl@3b6ce69c woken up from select, 0/0 selected

The 0/0 shows the (# of keys in the selected keys set)/(# of keys in the key set). I understand why the # of keys in the selected set is 0 but shouldn't the # of keys in the key set be at least 1.

The only selector thread that is actually connected (42) seems to be functioning normally, ie "woken up from select, 1/1 selected".

Is this all normal behavior?

sbordet commented 6 years ago

That is why we think it's a JVM bug. Having 0/0 and not blocking in select() is abnormal behavior.

joemokos commented 6 years ago

Thanks for responding so quickly. Shouldn't the 2nd 0 be > 0. Does that mean the set of sockets we're interested in is empty? I wonder what the behavior is of select when the key set is empty. Would it block or return immediately?

sbordet commented 6 years ago

Yes the second zero should be > 0, i.e. there are no sockets managed by that selector. The selector should block in select() if there are no sockets - that is what happens when you start a server but have not yet incoming connections.

joakime commented 6 years ago

Looks like the nightly snapshot deployment has the instrumentation that @gregw added in commit 426fb955f8a634e591d2f2dda61768b13136f69d

Jetty Snapshot Repository URL: https://oss.sonatype.org/content/repositories/jetty-snapshots/

Please use version 9.4.10-20180403.160317-6 and see if there's anything from the logs that could help.

joemokos commented 6 years ago

Thanks Joakim. I'll get it deployed to my customer's site as soon as I get the go ahead.

joakime commented 6 years ago

Just a reminder, there are jetty-home and jetty-distribution SNAPSHOT artifacts available there too.

joemokos commented 6 years ago

Will both of those contain the additional instrumentation?

joakime commented 6 years ago
[distros]$ tar -zxf jetty-home-9.4.10-20180403.160523-6.tar.gz
[distros]$ cd jetty-home-9.4.10-SNAPSHOT/
[jetty-home-9.4.10-SNAPSHOT]$ ls
bin  etc  lib  license-eplv10-aslv20.html  modules  notice.html  start.jar  VERSION.txt
[jetty-home-9.4.10-SNAPSHOT]$ jar -xvf lib/jetty-util-9.4.10-SNAPSHOT.jar org/eclipse/jetty/version/build.properties
 inflated: org/eclipse/jetty/version/build.properties
[jetty-home-9.4.10-SNAPSHOT]$ cat org/eclipse/jetty/version/build.properties
buildNumber=42d9da56c0292467074789bd557b803f1adc7269
timestamp=1522771392088
version=9.4.10-SNAPSHOT
scmUrl=scm:git:https://github.com/eclipse/jetty.project.git/jetty-util

That was build from commit 42d9da56c0292467074789bd557b803f1adc7269, which includes the instrumentation commit 426fb955f8a634e591d2f2dda61768b13136f69d

So, Yes, both of those snapshot tarballs contain the instrumentation from commit 426fb955f8a634e591d2f2dda61768b13136f69d

joemokos commented 6 years ago

jetty-select.txt

I haven't been able to deploy the instrumented version yet but I was able to get some debug logging around the time the problem occurred. I've attached a jetty debug log snippet. The log starts at 18:26:33.523 and the problem occurs at 18:30:56.617 (line 15241 in the log). It appears that threads WebSocketContainer@1978341506-31, -36, -38 and -44 are functioning normally while threads -33, -34, -35 and -37 are blocked on a select. Then at 18:30:56.617, the blocked threads wake up and go into a tight loop. I can't tell if there are any clues in this log as to why this happens. I was hoping this may help you to identify the cause.

sbordet commented 6 years ago

I looked at the logs, but threads -33, -34, -35 and -37 have no prior logs - when they start logging they are already in a loop.

It is also curious that there is a ~2 second pause between 18:30:54.925 and 18:30:56.617 where the spin loop begins. Did you remove log lines in there ?

It would be great to know what those spinning threads were doing before looping.

ihor-zhakun commented 6 years ago

Hi

I've got the same problem, when very simple project (from the comment of related issue) after 2 hrs (+/- few minutes) got 100% CPU usage https://github.com/hazelcast/hazelcast/issues/7943#issuecomment-211280157

Env: Windows 10, jdk build 1.8.0_171-b11, tried also with jdk v71, v161 also different versions of hazelcast: 3.6.2, 3.8.8, 3.9.4, 3.10. Problem always is reproducing with 100% chance after 2 hrs of idle.

In debug there is next info, if compare data before and after CPU overhead: sun.nio.ch.WindowsSelectorImpl.SubSelector#poll0 before - readFds array is fulled with 0 after - readFds[0] == 1, value of readFds[1] == sun.nio.ch.WindowsSelectorImpl#wakeupSourceFd also method poll0 is executing instantly, without any timeout and thread is in endless loop without block or wait.

Any help on the problem? Or any advise what to debug else or try to change?

sbordet commented 6 years ago

@linkedz do you get this issue with the latest Jetty 9.4.9 (or later) ? Jetty 9.4.9 contains the fix for #2218.

Can you also scroll up in this issue and try the extra instrumentation that @gregw wrote to detect where is the fault ?

ihor-zhakun commented 6 years ago

I did get the same issue without jetty - but with hazelcast only with WindowsSelectorImpl class

sbordet commented 6 years ago

@linkedz ok so this is probably a Windows JDK bug. If you can run with the instrumentation code that @gregw wrote, we will have something concrete to file a JDK issue.

ihor-zhakun commented 6 years ago

@sbordet Instrumentation code written is about changes in jetty-io, while in my test case there is no dependency on jetty at all. Probably, it's possible to make some debug log using aspects on WindowsSelectorImpl class. But, after draft looking, it will be possible to do only with public method sun.nio.ch.SelectorImpl#select(long) and see, that it will be always executed in few nanosec

arshadwkhan commented 6 years ago

@sbordet i see the same issue with jetty 9.4.10 on a linux machine. Almost everyone pointing to jdk bug with nio. I see that netty has tried to address this with this fix https://github.com/netty/netty/pull/565/commits/292a186d169915418022253317d1b7c60d042bda

sbordet commented 6 years ago

@arshadwkhan the Netty fix is 6 years old, I hope in the meanwhile the bug has been fixed in the JDK.

If you see the same issue, please try 9.4.12.RC2 or the upcoming 9.4.12 final release, and if you still see the issue then we need detailed information.

Meanwhile we have fixed #2218 and #2755 that may have rendered this issue obsolete.

joemokos commented 6 years ago

When I first reported this bug the problem showed up in the websocket client side. Now I'm seeing it occasionally server side (on the qtp threads). Same pattern that I saw in my comment on Apr 2 - looping on select that returns 0 selected. I just upgraded to 9.4.11 yesterday hoping it will solve the problem. Question - is #2755 in 9.4.12.RC2 or 9.4.11? If 9.4.12.RC2, when will 9.4.12 be GA?

joakime commented 6 years ago

Tip.

The last commit in #2755 is commit b561ee27342a26b88b7f4d3aaf2d5175e79429e9

If you click on that commit in github, it lists the tags that commit is present in. So, yes, it shows up for the first time in 9.4.12.RC2

joakime commented 6 years ago

Also note that issue #2755 is for people that start/stop (repeatedly) the HttpClient very quickly (like you would see in unit tests). The fix for #2755 is focused only on that behavior.

sbordet commented 6 years ago

Well it's a generic fix for for thread interruption, that should happen very rarely, mostly in test scenarios.

arshadwkhan commented 6 years ago

@sbordet we are using embedded jetty server that listens on port 80 using serverconnector and have a single servlet derived from asyncproxyservlet that reverse proxies the request to target. Will the fix you mentioned above will have any bearing on the issue I mentioned ?

sbordet commented 6 years ago

@arshadwkhan I don't know because you did not say much about your problem.

Perhaps it is the same as the one in this issue, perhaps not.

Try the latest Jetty release and report back if you still see the problem.

arshadwkhan commented 6 years ago

@sbordet I am not able to reproduce it in dev env. This is only happening in production env. Can you tell me what all things you would require ?

In the meantime why not LOG following in ManagedSelector.java at WARNING level ? LOG.debug("Selector {} woken with none selected", selector);

arshadwkhan commented 6 years ago

@sbordet @joakime I captured following jetty debug logs to see the behaviour of selector on windows machine. I sent a single http request and saw 6 "woken with none selected" log entries . Is this the correct behaviour ?

jetty version : 9.4.10 java version "1.8.0_121"

[2018-08-31@10:44:11|qtp205721196-23|DEBUG|ManagedSelector] : Selector sun.nio.ch.WindowsSelectorImpl@2d01dd9d woken with none selected

[2018-08-31@10:44:11|qtp205721196-23|DEBUG|ManagedSelector] : Selector sun.nio.ch.WindowsSelectorImpl@2d01dd9d woken with none selected

[2018-08-31@10:44:11|qtp205721196-24|DEBUG|ManagedSelector] : Selector sun.nio.ch.WindowsSelectorImpl@2d01dd9d woken with none selected

[2018-08-31@10:44:11|qtp205721196-20|DEBUG|ManagedSelector] : Selector sun.nio.ch.WindowsSelectorImpl@2d01dd9d woken with none selected
sbordet commented 6 years ago

Yes it is normal behavior and that's why we cannot change the log level to WARN.

We want evidence that you have a CPU spin.

We would like to see multiple thread dumps taken 1-2 seconds apart, we would like to see a JMC snapshot that shows what thread is consuming 100% CPU, we would like to see a Server dump when it happens. We can start with that and refine as we go along.

Or, you move to an operative system that does not require us to workaround JDK bugs :grinning:

arshadwkhan commented 6 years ago

The issue is only observed in production env .The only thing I can manage from the production env is thread dump and top output showing the thread causing high cpu.

Where would I go from Oracle Linux 6 (x86_64) UEK Release 3 ?

sbordet commented 6 years ago

@arshadwkhan you are on Linux? Then open a new issue. This one is for Windows. From your production environment you should be able to take also server dumps. We need evidence, otherwise it could be a bug in your application, not in Jetty.

arshadwkhan commented 6 years ago

@sbordet when you say server dump does it mean thread dump ?

sbordet commented 6 years ago

No, I mean this: https://www.eclipse.org/jetty/documentation/9.4.x/jetty-dump-tool.html