quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.79k stars 2.68k forks source link

Thread locked after stress test #5443

Closed yuhaibohotmail closed 4 years ago

yuhaibohotmail commented 4 years ago

Describe the bug (Describe the problem clearly and concisely.)

I did a simple stress test,then the program keeps giving alarm log.

21:59:37 WARNING [io.ve.co.im.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-worker-thread-9,5,main]=Thread[vert.x-worker-thread-9,5,main] has been blocked for 3382564 ms, time limit is 60000 ms: io.vertx.core.VertxException: Thread blocked
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:502)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.awaitWriteable(VertxBlockingOutput.java:111)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.write(VertxBlockingOutput.java:71)
        at io.quarkus.resteasy.runtime.standalone.VertxHttpResponse.writeBlocking(VertxHttpResponse.java:179)
        at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:69)
        at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:42)
        at org.jboss.resteasy.util.CommitHeaderOutputStream.write(CommitHeaderOutputStream.java:64)
        at org.jboss.resteasy.util.DelegatingOutputStream.write(DelegatingOutputStream.java:42)
        at org.jboss.resteasy.plugins.providers.jsonb.JsonBindingProvider.writeTo(JsonBindingProvider.java:137)
        at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.writeTo(AbstractWriterInterceptorContext.java:193)
        at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.writeTo(ServerWriterInterceptorContext.java:64)
        at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.proceed(AbstractWriterInterceptorContext.java:155)
        at org.jboss.resteasy.core.ServerResponseWriter.lambda$writeNomapResponse$2(ServerResponseWriter.java:156)
        at org.jboss.resteasy.core.ServerResponseWriter$$Lambda$417/1780592217.run(Unknown Source)
        at org.jboss.resteasy.core.interception.jaxrs.ContainerResponseContextImpl.filter(ContainerResponseContextImpl.java:404)
        at org.jboss.resteasy.core.ServerResponseWriter.executeFilters(ServerResponseWriter.java:232)
        at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:97)
        at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:70)
        at org.jboss.resteasy.core.SynchronousDispatcher.writeResponse(SynchronousDispatcher.java:578)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:508)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:252)
        at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$332/1061176646.run(Unknown Source)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:153)
        at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$338/1804746610.get(Unknown Source)
        at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:363)
        at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:156)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:238)
        at io.quarkus.resteasy.runtime.standalone.RequestDispatcher.service(RequestDispatcher.java:73)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatch(VertxRequestHandler.java:109)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatchRequestContext(VertxRequestHandler.java:84)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.lambda$handle$0(VertxRequestHandler.java:71)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler$$Lambda$313/2064123549.handle(Unknown Source)
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:316)
        at io.vertx.core.impl.ContextImpl$$Lambda$327/865803395.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

After the test ,All requests are unresponsive . jstack log:

"mysql-cj-abandoned-connection-cleanup" #54 daemon prio=5 os_prio=0 tid=0x00007ff55c042800 nid=0xfaa in Object.wait() [0x00007ff60e3e8000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
    - locked <0x00000006ca58bd08> (a java.lang.ref.ReferenceQueue$Lock)
    at com.mysql.cj.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:85)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - <0x00000006ca58b8f8> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"vert.x-worker-thread-12" #53 prio=5 os_prio=0 tid=0x00007ff58c007000 nid=0xfa9 in Object.wait() [0x00007ff60e8e8000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.awaitWriteable(VertxBlockingOutput.java:111)
    at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput.write(VertxBlockingOutput.java:71)
    - locked <0x00000006c8739c58> (a io.vertx.core.http.impl.Http1xServerConnection)
    at io.quarkus.resteasy.runtime.standalone.VertxHttpResponse.writeBlocking(VertxHttpResponse.java:179)
    at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:69)
    at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.write(VertxOutputStream.java:42)
    at org.jboss.resteasy.util.CommitHeaderOutputStream.write(CommitHeaderOutputStream.java:64)
    at org.jboss.resteasy.util.DelegatingOutputStream.write(DelegatingOutputStream.java:42)
    at org.jboss.resteasy.plugins.providers.jsonb.JsonBindingProvider.writeTo(JsonBindingProvider.java:137)
    at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.writeTo(AbstractWriterInterceptorContext.java:193)
    at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.writeTo(ServerWriterInterceptorContext.java:64)
    at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.proceed(AbstractWriterInterceptorContext.java:155)
    at org.jboss.resteasy.core.ServerResponseWriter.lambda$writeNomapResponse$2(ServerResponseWriter.java:156)
    at org.jboss.resteasy.core.ServerResponseWriter$$Lambda$417/1780592217.run(Unknown Source)
    at org.jboss.resteasy.core.interception.jaxrs.ContainerResponseContextImpl.filter(ContainerResponseContextImpl.java:404)
    - locked <0x00000007bea82098> (a org.jboss.resteasy.core.interception.jaxrs.ContainerResponseContextImpl)
    at org.jboss.resteasy.core.ServerResponseWriter.executeFilters(ServerResponseWriter.java:232)
    at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:97)
    at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:70)
    at org.jboss.resteasy.core.SynchronousDispatcher.writeResponse(SynchronousDispatcher.java:578)
    at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:508)
    at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:252)
    at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$332/1061176646.run(Unknown Source)
    at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:153)
    at org.jboss.resteasy.core.SynchronousDispatcher$$Lambda$338/1804746610.get(Unknown Source)
    at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:363)
    - locked <0x00000007bea82130> (a org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext)
    at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:156)
    at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:238)
    at io.quarkus.resteasy.runtime.standalone.RequestDispatcher.service(RequestDispatcher.java:73)
    at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatch(VertxRequestHandler.java:109)
    at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatchRequestContext(VertxRequestHandler.java:84)
    at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.lambda$handle$0(VertxRequestHandler.java:71)
    at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler$$Lambda$313/2064123549.handle(Unknown Source)
    at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:316)
    at io.vertx.core.impl.ContextImpl$$Lambda$327/865803395.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - <0x00000006c8863b08> (a java.util.concurrent.ThreadPoolExecutor$Worker)

Expected behavior (Describe the expected behavior clearly and concisely.)

Actual behavior (Describe the actual behavior clearly and concisely.)

To Reproduce Steps to reproduce the behavior:

  1. My program was modified from hibernate-orm-panache-resteasy in quickstart. I add a restful service returns all records of the table,there are 800 records in the table.
    @GET
    public List<Student> findAll() {
        List<Student> list = Student.findAll().list();
        if (list == null) {
            throw new WebApplicationException("Fruit  does not exist.", 404);
        }
        return list;
    }
  2. Then run in JVM mode.
  3. Did a simple stress test .
    wrk -t12 -c2000 -d30s  http://10.32.1.***:8888/rest/students/

Configuration

# Add your application.properties here, if applicable.
quarkus.datasource.url=jdbc:mysql://10.32.1.***:3306/xyz
quarkus.datasource.driver=com.mysql.cj.jdbc.Driver
    <dependency>
      <groupId>io.quarkus</groupId>
      <artifactId>quarkus-jdbc-mysql</artifactId>
    </dependency>

Screenshots (If applicable, add screenshots to help explain your problem.)

Environment (please complete the following information):

Additional context

The full jstack dump info jstack-l.txt

geoand commented 4 years ago

@johnaohara have you seen anything similar?

johnaohara commented 4 years ago

@geoand looks similar to #4173

geoand commented 4 years ago

Which was fixed however

johnaohara commented 4 years ago

@geoand Does your reproducer fail for master now? (https://github.com/geoand/quarkus-vertx-threads-blocked)

geoand commented 4 years ago

Let me check

johnaohara commented 4 years ago

Have just re-run against 1.0.0.CR1 and is working as expected

geoand commented 4 years ago

I just ran it my reproducer against master and it worked just fine.

johnaohara commented 4 years ago

I looks like there is a race condition in request.response().drainHandler(). I have opened a pr #5451.

yuhaibohotmail commented 4 years ago

@johnaohara I've simplified my program and can reproduce it in several of my environments.

My endpoint:

@Path("/hello")
public class ExampleResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    public String hello() {
        StringBuffer buffer = new StringBuffer();
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        buffer.append("[");
        for(int i = 0 ;i <1000;i++ ) {
            buffer.append("{\"id\":1,\"name\":\"tom\",\"sex\":\"f\",\"testCode\":\"12345\"},");
        }
        buffer.append("{\"id\":1,\"name\":\"tom\",\"sex\":\"f\",\"testCode\":\"12345\"} ]");
        return buffer.toString();
    }
}

Run wrk

[root@localhost wrk]#  wrk -t20 -c2000 -d30s  http://localhost:8888/hello
Running 30s test @ http://localhost:8888/hello
  20 threads and 2000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   652.20ms   95.72ms   1.03s    79.64%
    Req/Sec   180.39    180.45     0.99k    85.29%
  37192 requests in 30.10s, 1.74GB read
  Socket errors: connect 999, read 0, write 0, timeout 0
Requests/sec:   1235.66
Transfer/sec:     59.21MB

When the wrk is running, the endpoint can be accessed through the browser.
Once the wrk is finished, the program reports some exception and then does not respond to the request.

2019-11-14 03:32:42,126 INFO  [io.quarkus] (main) load-test 1.0.0-SNAPSHOT (running on Quarkus 1.0.0.CR1) started in 1.222s. Listening on: http://0.0.0.0:8888
2019-11-14 03:32:42,146 INFO  [io.quarkus] (main) Profile prod activated.
2019-11-14 03:32:42,146 INFO  [io.quarkus] (main) Installed features: [cdi, resteasy]
2019-11-14 03:33:16,424 SEVERE [io.ver.cor.imp.ContextImpl] (vert.x-eventloop-thread-0) Unhandled exception: java.lang.IllegalMonitorStateException
        at java.lang.Object.notify(Native Method)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput$1.handle(VertxBlockingOutput.java:34)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput$1.handle(VertxBlockingOutput.java:25)
        at io.vertx.core.http.impl.HttpServerResponseImpl.handleClosed(HttpServerResponseImpl.java:616)
        at io.vertx.core.http.impl.HttpServerResponseImpl.handleException(HttpServerResponseImpl.java:590)
        at io.vertx.core.http.impl.HttpServerRequestImpl.handleException(HttpServerRequestImpl.java:577)
        at io.vertx.core.http.impl.Http1xServerConnection.handleClosed(Http1xServerConnection.java:413)
        at io.vertx.core.net.impl.VertxHandler.lambda$channelInactive$4(VertxHandler.java:162)
        at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:369)
        at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:43)
        at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:232)
        at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:224)
        at io.vertx.core.net.impl.VertxHandler.channelInactive(VertxHandler.java:162)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
        at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:393)
        at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:358)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1417)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
        at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:913)
        at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:819)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
        at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

2019-11-14 03:33:16,425 SEVERE [io.ver.cor.imp.ContextImpl] (vert.x-eventloop-thread-0) Unhandled exception: java.lang.IllegalMonitorStateException
        at java.lang.Object.notify(Native Method)
        at io.quarkus.resteasy.runtime.standalone.VertxBlockingOutput$1.handle(VertxBlockingOutput.java:34)

source code load-test.zip

johnaohara commented 4 years ago

@yuhaibohotmail thanks a lot for the reproducer, I will use it to test a potential fix.

johnaohara commented 4 years ago

@yuhaibohotmail what hardware are you running on? I am not reproducing the error on my local machine, but will also test in our lab

johnaohara commented 4 years ago

I can re-create now

yuhaibohotmail commented 4 years ago

@johnaohara Can you reproduce it now?

I tested it in two environments.

  1. Our private cloud: 8CPU/16GMemory/ Centos 7
    The code and wrk run on the same machine
  2. My VM: 1CPU/4GMemory/ Centos 7 . Run on VMware 15.5 The code and wrk run on the same machine
johnaohara commented 4 years ago

@yuhaibohotmail I have your test running without hanging the application. Now waiting on CI and approval for the PR. Thanks

johnaohara commented 4 years ago

@yuhaibohotmail #5451 fixes this issue and will be included in the next release. Thank you for your help