streamnative / bookkeeper-achieved

Apache Bookkeeper
https://bookkeeper.apache.org
Apache License 2.0
3 stars 2 forks source link

ISSUE-2820: Journal thread fetch entry from queue cost too much time on acquireInterruptibly #412

Open sijie opened 2 years ago

sijie commented 2 years ago

Original Issue: apache/bookkeeper#2820


BUG REPORT

Describe the bug The bookie configuration list as follow.

The write throughput keeps on 500MB/s. When read request increase and the read throughput reaches about 600MB/s, the journal queue will be full for a long time. I check the journal sync latency is low. I print the stack of the bookie, and found the BookieJournal-3181 thread keeps on acquireInterruptibly in ArrayBlockingQueue.poll operation.

"BookieJournal-3181" #25 prio=5 os_prio=0 tid=0x00007f89b8001000 nid=0x14e waiting on condition [0x00007f8daebee000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00007f948dfb55c8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireInterruptibly(AbstractQueuedSynchronizer.java:897)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1222)
    at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:413)
    at org.apache.bookkeeper.bookie.Journal.run(Journal.java:980)

   Locked ownable synchronizers:
    - None

And the bookie-io threads are keeps on ArrayBlockingQueue.put operation

"bookie-io-1-2" #105 prio=5 os_prio=0 tid=0x00007f8fd0007800 nid=0x155 waiting on condition [0x00007f8dac3cd000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00007f948dfb55a0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
    at org.apache.bookkeeper.bookie.Journal.logAddEntry(Journal.java:868)
    at org.apache.bookkeeper.bookie.Journal.logAddEntry(Journal.java:857)
    at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1297)
    at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1402)
    - locked <0x00007f96c39c1a18> (a org.apache.bookkeeper.bookie.LedgerDescriptorImpl)
    at org.apache.bookkeeper.proto.WriteEntryProcessor.processPacket(WriteEntryProcessor.java:80)
    at org.apache.bookkeeper.proto.PacketProcessorBase.safeRun(PacketProcessorBase.java:85)
    at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
    at org.apache.bookkeeper.proto.BookieRequestProcessor.processAddRequest(BookieRequestProcessor.java:625)
    at org.apache.bookkeeper.proto.BookieRequestProcessor.processRequest(BookieRequestProcessor.java:365)
    at org.apache.bookkeeper.proto.BookieRequestHandler.channelRead(BookieRequestHandler.java:80)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
    at org.apache.bookkeeper.proto.AuthHandler$ServerSideHandler.channelRead(AuthHandler.java:95)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at org.apache.bookkeeper.proto.BookieProtoEncoding$RequestDecoder.channelRead(BookieProtoEncoding.java:450)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:311)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:432)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795)
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480)
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)

Does anyone has any ideas for this situation?

The jstack list as follow jstack.log