apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

[bookie-io-1-9] ERROR org.apache.bookkeeper.proto.BookieServer - Unable to allocate memory, exiting bookie #14468

Open armandxu opened 2 years ago

armandxu commented 2 years ago

Describe the bug: when storage is full, the bookie fails with the following error:

2022-02-24T23:13:39,590+0800 [bookie-io-1-9] ERROR org.apache.bookkeeper.proto.BookieServer - Unable to allocate memory, exiting bookie
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 23605542941, max: 23622320128)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:802) ~[io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:731) ~[io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:648) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:623) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:202) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:186) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:136) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:126) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:395) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188) ~[io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:163) [org.apache.bookkeeper-bookkeeper-common-allocator-4.14.2.jar:4.14.2]
    at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:157) [org.apache.bookkeeper-bookkeeper-common-allocator-4.14.2.jar:4.14.2]
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188) [io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179) [io.netty-netty-buffer-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.channel.unix.PreferredDirectByteBufAllocator.ioBuffer(PreferredDirectByteBufAllocator.java:53) [io.netty-netty-transport-native-unix-common-4.1.72.Final-linux-x86_64.jar:4.1.72.Final]
    at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:120) [io.netty-netty-transport-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.channel.epoll.EpollRecvByteAllocatorHandle.allocate(EpollRecvByteAllocatorHandle.java:75) [io.netty-netty-transport-classes-epoll-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:780) [io.netty-netty-transport-classes-epoll-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$1.run(AbstractEpollChannel.java:425) [io.netty-netty-transport-classes-epoll-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [io.netty-netty-transport-classes-epoll-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
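
The figures in the error message can be checked by hand. The sketch below is only an arithmetic illustration, assuming (as the message wording suggests) that the allocation is refused when used plus requested bytes would exceed the limit. The function name `direct_memory_shortfall` is hypothetical and not part of Netty or BookKeeper.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def direct_memory_shortfall(used: int, requested: int, limit: int) -> int:
    """Return by how many bytes the request would exceed the limit (0 if it fits)."""
    return max(0, used + requested - limit)


# Values copied from the OutOfDirectMemoryError message in the log above.
used = 23_605_542_941
requested = 16_777_216
limit = 23_622_320_128

print(limit / GIB)       # limit expressed in GiB
print(requested / MIB)   # size of the failed request in MiB
print(limit - used)      # bytes still free when the request failed
print(direct_memory_shortfall(used, requested, limit))  # overshoot in bytes
```

Under that assumption, the limit is exactly 22 GiB, the failed request is a single 16 MiB chunk, and the bookie had slightly less than 16 MiB of direct memory left, so the pool was effectively exhausted rather than hit by one oversized request.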

jianke221 commented 2 years ago

@armandxu and I are colleagues. Let me describe what we observed in monitoring. In our tests, direct memory usage soars after the bookie node becomes read-only. After the ledgers expire and are deleted, direct memory stops growing but is not released. The next time the bookie becomes read-only, it continues to grow. If usage reaches the direct memory limit before the ledgers expire and are deleted, the bookie reports the error above and exits. What should we do in this case, and what causes the direct memory to soar? (Attached screenshot: d4a951477e5b13ef559a857ee374f3d)

codelipenghui commented 2 years ago

@hangc0276 @zymap Could you please help check this issue?

hangc0276 commented 2 years ago

@jianke221 @armandxu Can this case be reproduced? Would you please share the steps and the status of the BookKeeper cluster when you encountered this issue?

zymap commented 2 years ago

I would also like to know more about the cluster configuration, such as the ensemble size, write quorum, and ack quorum.

armandxu commented 2 years ago

Sorry for the late reply. The settings are ensemble size = 2, write quorum = 2, ack quorum = 2. I cannot reproduce this case now because I have hit a new problem: the broker cannot find the bookie, even though the bookie is ready. (Attached screenshots: WechatIMG69, WechatIMG68)
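
For readers unfamiliar with these three settings, here is a minimal sketch of what the reported values imply. It assumes only the general BookKeeper rule that ensemble size >= write quorum >= ack quorum, and that a write completes once ack-quorum bookies have acknowledged it. The helper name `describe_quorum` is hypothetical, and the sketch does not diagnose the memory issue itself.

```python
def describe_quorum(ensemble: int, write_quorum: int, ack_quorum: int) -> dict:
    """Validate E >= Qw >= Qa >= 1 and summarize what the settings imply."""
    if not (ensemble >= write_quorum >= ack_quorum >= 1):
        raise ValueError("expected ensemble >= write_quorum >= ack_quorum >= 1")
    return {
        # Entries are striped across bookies only when E is larger than Qw.
        "striping": ensemble > write_quorum,
        # Number of write-quorum bookies that may fail to ack an entry
        # without blocking that write.
        "missing_acks_tolerated": write_quorum - ack_quorum,
    }


# Values reported in the comment above.
print(describe_quorum(ensemble=2, write_quorum=2, ack_quorum=2))
```

With 2/2/2, every entry goes to both bookies and needs an acknowledgement from both, so a write cannot complete if either bookie fails to acknowledge it.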

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.

jimmycxm commented 2 years ago

@jianke221 Have you resolved the issue? I have run into the same issue.