wso2 / api-manager

All issues, tasks, improvements and new features of WSO2 API Manager
Apache License 2.0
34 stars 8 forks source link

WS02 APIM 4.1.0: High CPU Utilization over 100% and HTTP-Sender I/O dispatcher threads consuming %CPU #2956

Open anupbvr opened 3 months ago

anupbvr commented 3 months ago

Description

WSO2 APIM 4.1.0 running on OpenJDK 11 continues to shows 100% more CPU utilization when being used in an environment with average throughput. CPU usage will remain in more than 100% figures and the issue will be resolved after restarting the server. After running for couple of days, the problem is re occurring and only a restart is the work around now.

Following threads are using more CPU.

- %CPU  CPU  NI S     TIME     PID     TID
- 
- 65.3   -   0 R 98-06:31:07 1786915 1787000  (Decimal ->  **1b4478**)
- 
- 78.9   -   0 R 118-14:04:13 1786915 1787001 (Decimal -> **1b4479**)

Following is the highlight of thread dump for a thread using more CPU.

"HTTP-Sender I/O dispatcher-2" #69 prio=5 os_prio=0 cpu=**10245828487.59ms** elapsed=12983070.78s tid=0x00007f325835c000 nid=0x**1b4479** runnable  [0x00007f3241cc7000]
       java.lang.Thread.State: RUNNABLE
        at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Native Method)
        at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Throwable.java:787)
        - **locked** <0x00000000f28b12b8> (a org.apache.http.MalformedChunkCodingException)
        at java.lang.Throwable.<init>(java.base@11.0.17/Throwable.java:270)
        at java.lang.Exception.<init>(java.base@11.0.17/Exception.java:66)
        at java.io.IOException.<init>(java.base@11.0.17/IOException.java:58)
        at org.apache.http.MalformedChunkCodingException.<init>(MalformedChunkCodingException.java:54)
        at org.apache.http.impl.nio.codecs.ChunkDecoder.readChunkHead(ChunkDecoder.java:112)
        at org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:205)
        at org.apache.synapse.transport.passthru.Pipe.produce(Pipe.java:250)
        at org.apache.synapse.transport.passthru.TargetResponse.read(TargetResponse.java:164)
        at org.apache.synapse.transport.passthru.TargetHandler.inputReady(TargetHandler.java:606)
        at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)

Steps to Reproduce

Thread Usage shows HTTP-Sender I/O dispatcher threads consuming more CPU and and all other threads are showing normal figures.

%CPU  CPU  NI S     TIME     PID     TID

65.3   -   0 R 98-06:31:07 1786915 1787000  (Decimal ->  **1b4478**)

78.9   -   0 R 118-14:04:13 1786915 1787001 (Decimal -> **1b4479**)

 1.0   -   0 R 1-13:32:03 1786915 1787002   (Decimal -> 1b447a)

 3.9   -   0 R 5-21:18:34 1786915 1787003   (Decimal -> 1b447b)

 1.7   -   0 S 2-15:00:19 1786915 1787200   (Decimal -> 1b4540)

> 1.5   -   0 S 2-09:38:23 1786915 1786922   (Decimal -> 1b442a)

Affected Component

APIM

Version

4.1.0

Environment Details (with versions)

Relevant Log Output

HTTP-Sender I/O dispatcher-1" #68 prio=5 os_prio=0 cpu=**8490642866.44ms** elapsed=12983070.78s tid=0x00007f325801c800 nid=0x**1b4478** runnable  [0x00007f3241dc7000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)

"HTTP-Sender I/O dispatcher-2" #69 prio=5 os_prio=0 cpu=**10245828487.59ms** elapsed=12983070.78s tid=0x00007f325835c000 nid=0x**1b4479** runnable  [0x00007f3241cc7000]
       java.lang.Thread.State: RUNNABLE
        at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Native Method)
        at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Throwable.java:787)
        - **locked** <0x00000000f28b12b8> (a org.apache.http.MalformedChunkCodingException)
        at java.lang.Throwable.<init>(java.base@11.0.17/Throwable.java:270)
        at java.lang.Exception.<init>(java.base@11.0.17/Exception.java:66)
        at java.io.IOException.<init>(java.base@11.0.17/IOException.java:58)
        at org.apache.http.MalformedChunkCodingException.<init>(MalformedChunkCodingException.java:54)
        at org.apache.http.impl.nio.codecs.ChunkDecoder.readChunkHead(ChunkDecoder.java:112)
        at org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:205)
        at org.apache.synapse.transport.passthru.Pipe.produce(Pipe.java:250)
        at org.apache.synapse.transport.passthru.TargetResponse.read(TargetResponse.java:164)
        at org.apache.synapse.transport.passthru.TargetHandler.inputReady(TargetHandler.java:606)
        at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
        at org.apache.synapse.transport.http.conn.LoggingNHttpClientConnection.consumeInput(LoggingNHttpClientConnection.java:115)
        at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:83)
        at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:41)
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
        at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)
"HTTP-Sender I/O dispatcher-3" #70 prio=5 os_prio=0 cpu=**135099517.41ms** elapsed=12983070.78s tid=0x00007f3258230000 nid=0x1b447a runnable  [0x00007f3241bc6000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:205)
    at org.apache.synapse.transport.passthru.Pipe.produce(Pipe.java:250)
    at org.apache.synapse.transport.passthru.TargetResponse.read(TargetResponse.java:164)
    at org.apache.synapse.transport.passthru.TargetHandler.inputReady(TargetHandler.java:606)
    at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
    at org.apache.synapse.transport.http.conn.LoggingNHttpClientConnection.consumeInput(LoggingNHttpClientConnection.java:115)
    at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:83)
    at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:41)
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)

"HTTP-Sender I/O dispatcher-4" #71 prio=5 os_prio=0 cpu=**508690354.93ms** elapsed=12983070.78s tid=0x00007f3258231000 nid=0x1b447b runnable  [0x00007f3241ac5000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Native Method)
    at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Throwable.java:787)
    - **locked** <0x00000000f244f4b0> (a org.apache.http.MalformedChunkCodingException)
    at java.lang.Throwable.<init>(java.base@11.0.17/Throwable.java:270)
    at java.lang.Exception.<init>(java.base@11.0.17/Exception.java:66)
    at java.io.IOException.<init>(java.base@11.0.17/IOException.java:58)
    at org.apache.http.MalformedChunkCodingException.<init>(MalformedChunkCodingException.java:54)
    at org.apache.http.impl.nio.codecs.ChunkDecoder.readChunkHead(ChunkDecoder.java:112)
    at org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:205)
    at org.apache.synapse.transport.passthru.Pipe.produce(Pipe.java:250)
    at org.apache.synapse.transport.passthru.TargetResponse.read(TargetResponse.java:164)
    at org.apache.synapse.transport.passthru.TargetHandler.inputReady(TargetHandler.java:606)
    at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
    at org.apache.synapse.transport.http.conn.LoggingNHttpClientConnection.consumeInput(LoggingNHttpClientConnection.java:115)
    at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:83)
    at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:41)
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(java

Related Issues

No response

Suggested Labels

wso

anupbvr commented 2 months ago

We took the head dump and found the below details in HTTP-Sender I/O dispatcher-3 under the IOSessionImpl object -> bufferStatus -> buffer ->

hb byte [8192] @0xc68ae770 //response from back end api here [only a chunk of around 1024 bytes]// with shallow and retained heap as 8208.

Similar bufferStatus found for HTTP-Sender I/O dispatcher-4 thread. Please note, both of them are executed for the same API request and dealing with the API response in chunks. Couple of threads were executing org.apache.http.impl.nio.codecs.ChunkDecoder and couple of them are in waiting state. But none of the HTTP sender dispatcher threads were in MalformedChunkCodingException class.

We have collected one more thread dump and found the same IO dispatcher threads occupying more CPU. This time 3 out of 4 HTTP Sender IO threads are occupying the CPU. When we took the thread dump again (after taking heap dump) we found one of the thread executing MalformedChunkCodingException.

NI S     TIME     PID     TID
27.9   -   0 R 8-10:56:54 1675674 1675759
62.0   -   0 R 18-18:07:14 1675674 1675760
49.0   -   0 R 14-19:47:42 1675674 1675761

Here are the new thread dump for those.

> HTTP-Sender I/O dispatcher-2" #69 prio=5 os_prio=0 cpu=730598183.23ms elapsed=2611174.26s tid=0x00007fa8b801f800 nid=0x1991ef runnable  [0x00007fa8922fb000]
>    java.lang.Thread.State: RUNNABLE
>     at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Native Method)
>     at java.lang.Throwable.fillInStackTrace(java.base@11.0.17/Throwable.java:787)
>     - locked <0x00000000ea570ba0> (a org.apache.http.MalformedChunkCodingException)
>     at java.lang.Throwable.<init>(java.base@11.0.17/Throwable.java:270)
>     at java.lang.Exception.<init>(java.base@11.0.17/Exception.java:66)
>     at java.io.IOException.<init>(java.base@11.0.17/IOException.java:58)
>     at org.apache.http.MalformedChunkCodingException.<init>(MalformedChunkCodingException.java:54)
>     at org.apache.http.impl.nio.codecs.ChunkDecoder.readChunkHead(ChunkDecoder.java:112)
>     at org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:205)
>     at org.apache.synapse.transport.passthru.Pipe.produce(Pipe.java:250)
>     at org.apache.synapse.transport.passthru.TargetResponse.read(TargetResponse.java:164)
>     at org.apache.synapse.transport.passthru.TargetHandler.inputReady(TargetHandler.java:606)
>     at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
>     at org.apache.synapse.transport.http.conn.LoggingNHttpClientConnection.consumeInput(LoggingNHttpClientConnection.java:115)
>     at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:83)
>     at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:41)
>     at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
>     at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
>     at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
>     at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
>     at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)

Other threads...


"HTTP-Sender I/O dispatcher-3" #70 prio=5 os_prio=0 cpu=1620418247.95ms elapsed=2611174.26s tid=0x00007fa8b8021000 nid=0x1991f0 runnable  [0x00007fa8921fa000]
   java.lang.Thread.State: RUNNABLE
    at ```
sun.nio.ch.EPoll.wait(java.base@11.0.17/Native Method)
    at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@11.0.17/EPollSelectorImpl.java:120)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@11.0.17/SelectorImpl.java:124)
    - locked <0x00000000c92f0ae8> (a sun.nio.ch.Util$2)
    - locked <0x00000000c92f0988> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(java.base@11.0.17/SelectorImpl.java:136)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)

"HTTP-Sender I/O dispatcher-4" #71 prio=5 os_prio=0 cpu=1280846024.20ms elapsed=2611174.26s tid=0x00007fa8b8023000 nid=0x1991f1 runnable  [0x00007fa8920f9000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileDispatcherImpl.read0(java.base@11.0.17/Native Method)
    at sun.nio.ch.SocketDispatcher.read(java.base@11.0.17/SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(java.base@11.0.17/IOUtil.java:276)
    at sun.nio.ch.IOUtil.read(java.base@11.0.17/IOUtil.java:245)
    at sun.nio.ch.IOUtil.read(java.base@11.0.17/IOUtil.java:223)
    at sun.nio.ch.SocketChannelImpl.read(java.base@11.0.17/SocketChannelImpl.java:356)
    at org.apache.http.impl.nio.reactor.SessionInputBufferImpl.fill(SessionInputBufferImpl.java:231)
    at org.apache.http.impl.nio.codecs.AbstractContentDecoder.fillBufferFromChannel(AbstractContentDecoder.java:129)
    at org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:195)
    at org.apache.synapse.transport.passthru.Pipe.produce(Pipe.java:250)
    at org.apache.synapse.transport.passthru.TargetResponse.read(TargetResponse.java:164)
    at org.apache.synapse.transport.passthru.TargetHandler.inputReady(TargetHandler.java:606)
    at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
    at org.apache.synapse.transport.http.conn.LoggingNHttpClientConnection.consumeInput(LoggingNHttpClientConnection.java:115)
    at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:83)
    at org.apache.synapse.transport.passthru.ClientIODispatch.onInputReady(ClientIODispatch.java:41)
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(java.base@11.0.17/Thread.java:829)
anupbvr commented 2 months ago

We looked into the source code of Chunk Decoder readHead line no 112 and found below....

if (this.buffer.length() > 2 || this.endOfStream) { throw new MalformedChunkCodingException("CRLF expected at end of chunk"); }....

Is this related the size of JSON response back end service respond?