amplab / velox-modelserver

http://amplab.github.io/velox-modelserver/
Apache License 2.0

Jetty QueuedThreadPool RejectedExecutionException #20

Closed by dcrankshaw 9 years ago

dcrankshaw commented 9 years ago

For some reason we are getting RejectedExecutionExceptions thrown in Jetty's QueuedThreadPool.

I'm not sure exactly what's going on, but we only see it when the system is under high load. I'm assuming we've hit the limit of some resource, but I'm still tracking down which one. I suspect it is related to reaching the thread pool's queue limit.

Here is a representative stack trace:

WARN  [2014-11-03 21:38:47,436] edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool: dw{STARTED,8<=124<=1024,i=0,q=1024} rejected AC.ExReadCB@2c056c65
WARN  [2014-11-03 21:38:47,441] edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool: dw{STARTED,8<=124<=1024,i=0,q=1024} rejected AC.ExReadCB@186d0670
WARN  [2014-11-03 21:38:47,675] edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool: dw{STARTED,8<=140<=1024,i=0,q=1024} rejected AC.ExReadCB@d240d9d
WARN  [2014-11-03 21:38:47,996] edu.berkeley.veloxms.jetty.io.SelectorManager: Could not process key for channel java.nio.channels.SocketChannel[connected local=/10.143.219.252:8080 remote=/10.147.1.83:34687]
! java.util.concurrent.RejectedExecutionException: AC.ExReadCB@407e9e8e
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:357) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:407) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.FillInterest.fillable(FillInterest.java:78) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectChannelEndPoint.onSelected(SelectChannelEndPoint.java:109) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.processKey(SelectorManager.java:506) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.select(SelectorManager.java:463) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.run(SelectorManager.java:428) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
WARN  [2014-11-03 21:38:48,003] edu.berkeley.veloxms.jetty.io.SelectorManager: Could not process key for channel java.nio.channels.SocketChannel[connected local=/10.143.219.252:8080 remote=/10.143.223.222:60187]
! java.util.concurrent.RejectedExecutionException: AC.ExReadCB@d240d9d
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:357) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:407) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.FillInterest.fillable(FillInterest.java:78) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectChannelEndPoint.onSelected(SelectChannelEndPoint.java:109) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.processKey(SelectorManager.java:506) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.select(SelectorManager.java:463) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.run(SelectorManager.java:428) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
WARN  [2014-11-03 21:38:48,011] edu.berkeley.veloxms.jetty.io.SelectorManager: Could not process key for channel java.nio.channels.SocketChannel[connected local=/10.143.219.252:8080 remote=/10.143.223.222:58371]
! java.util.concurrent.RejectedExecutionException: AC.ExReadCB@186d0670
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool.execute(QueuedThreadPool.java:357) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:407) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.FillInterest.fillable(FillInterest.java:78) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectChannelEndPoint.onSelected(SelectChannelEndPoint.java:109) ~[veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.processKey(SelectorManager.java:506) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.select(SelectorManager.java:463) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.io.SelectorManager$ManagedSelector.run(SelectorManager.java:428) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at edu.berkeley.veloxms.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532) [veloxms-core-0.0.1-SNAPSHOT.jar:na]
! at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]

Here is the line in the Jetty code where the exception is caught and the warning is logged. And this is the link to the QueuedThreadPool source code.
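
The warnings above show `i=0,q=1024`, which reads as no idle threads and 1024 queued jobs, so a full bounded job queue is a plausible trigger. As a rough illustration of that failure mode (not Jetty's own code), here is a minimal sketch using the JDK's `ThreadPoolExecutor` in place of `QueuedThreadPool`. The class name `BoundedPoolDemo`, the helper `countRejections`, and the pool sizes are hypothetical and chosen only to keep the example small.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: a JDK pool standing in for Jetty's QueuedThreadPool.
public class BoundedPoolDemo {

    /**
     * Submits `jobs` tasks that all block until released, to a pool with a
     * fixed number of threads and a bounded queue, and counts rejections.
     */
    static int countRejections(int threads, int queueSize, int jobs)
            throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
        int rejected = 0;
        for (int i = 0; i < jobs; i++) {
            try {
                // Each job holds its worker thread until the latch opens,
                // which simulates a server that cannot keep up with load.
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // All threads are busy and the queue is full.
                rejected++;
            }
        }
        release.countDown();   // let the accepted jobs finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        // 2 threads + 4 queue slots accept 6 of 10 jobs.
        System.out.println("rejected = " + countRejections(2, 4, 10)); // prints "rejected = 4"
    }
}
```

If the same mechanism is at work here, raising the queue bound would only delay the rejections under sustained overload; the submitter still needs some form of backpressure.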

dcrankshaw commented 9 years ago

I'm still not quite sure why this is happening, but it's definitely the result of exceeding some resource limit. For now I have been avoiding it by throttling the requests on the client side, but it would be good to figure this out. When I see this error, the requests that can't be processed are silently dropped on the floor, which is definitely not the right failure mode.
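
The comment above does not say how the client-side throttling was done. One common approach is to cap the number of in-flight requests with a semaphore, and the sketch below assumes that approach. `ThrottledClient`, `send`, and `demo` are hypothetical names, and the `Runnable` stands in for whatever call actually issues the HTTP request.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client wrapper that limits concurrent in-flight requests.
public class ThrottledClient {
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxObserved = new AtomicInteger();

    ThrottledClient(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks the caller until a permit is free, then runs the request. */
    void send(Runnable request) throws InterruptedException {
        permits.acquire();
        try {
            int now = inFlight.incrementAndGet();
            maxObserved.accumulateAndGet(now, Math::max);
            request.run();
        } finally {
            inFlight.decrementAndGet();
            permits.release();
        }
    }

    /** Highest number of requests that were ever in flight at once. */
    int maxObserved() {
        return maxObserved.get();
    }

    /** Fires `requests` fake requests from `callers` threads; returns peak concurrency. */
    static int demo(int callers, int maxInFlight, int requests)
            throws InterruptedException {
        ThrottledClient client = new ThrottledClient(maxInFlight);
        ExecutorService callerPool = Executors.newFixedThreadPool(callers);
        for (int i = 0; i < requests; i++) {
            callerPool.execute(() -> {
                try {
                    // Stand-in for a real request: sleep for a few milliseconds.
                    client.send(() -> {
                        try {
                            Thread.sleep(5);
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        callerPool.shutdown();
        callerPool.awaitTermination(30, TimeUnit.SECONDS);
        return client.maxObserved();
    }

    public static void main(String[] args) throws InterruptedException {
        // Peak concurrency never exceeds the permit count (4 here).
        System.out.println("peak in-flight = " + demo(16, 4, 64));
    }
}
```

This only hides the problem from a well-behaved client. The server-side failure mode of silently dropping requests is unchanged.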

dcrankshaw commented 9 years ago

This is subsumed by #35.