streamnative / bookkeeper-achieved

Apache BookKeeper
https://bookkeeper.apache.org
Apache License 2.0

ISSUE-2806: OOM with 1 million ledgers per entry log #409

Open sijie opened 2 years ago

sijie commented 2 years ago

Original Issue: apache/bookkeeper#2806


BUG REPORT

Describe the bug

There are about 1M ledgers per entry log. After running for a while, an OOM occurs, even though there is still enough free memory overall. The OOM appears in two positions, as follows:

Position 1:

```
2021-09-21 02:22:08,323 [SyncThread-7-1] ERROR org.apache.bookkeeper.bookie.SyncThread - Exception in SyncThread
java.lang.OutOfMemoryError: Java heap space
	at org.apache.bookkeeper.bookie.storage.ldb.WriteCache.forEach(WriteCache.java:222) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
```

Position 2:

```
2021-09-21 02:24:14,987 [SyncThread-7-1] ERROR org.apache.bookkeeper.bookie.SyncThread - Exception in SyncThread
java.lang.OutOfMemoryError: Java heap space
	at org.apache.bookkeeper.util.collections.ConcurrentLongLongHashMap$Section.rehash(ConcurrentLongLongHashMap.java:673) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
	at org.apache.bookkeeper.util.collections.ConcurrentLongLongHashMap$Section.addAndGet(ConcurrentLongLongHashMap.java:456) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
	at org.apache.bookkeeper.util.collections.ConcurrentLongLongHashMap.addAndGet(ConcurrentLongLongHashMap.java:186) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
	at org.apache.bookkeeper.bookie.EntryLogMetadata.addLedgerSize(EntryLogMetadata.java:47) ~[org.apache.bookkeeper-bookkeeper-server-4.14.1.jar:4.14.1]
```

Main GC log:

```
2021-09-21T02:22:08.437+0800: 105453.194: [GC pause (G1 Humongous Allocation)
2021-09-21T02:22:08.449+0800: 105453.206: [Full GC (Allocation Failure) 10G->10G(20G), 1.9652874 secs]
   [Eden: 0.0B(992.0M)->0.0B(1024.0M) Survivors: 32.0M->0.0B Heap: 10.3G(20.0G)->10.3G(20.0G)], [Metaspace: 35551K->35539K(1081344K)]
 [Times: user=4.94 sys=0.00, real=1.96 secs]
2021-09-21T02:22:10.415+0800: 105455.172: [Full GC (Allocation Failure) 10G->10G(20G), 1.6151095 secs]
   [Eden: 0.0B(1024.0M)->0.0B(1024.0M) Survivors: 0.0B->0.0B Heap: 10.3G(20.0G)->10.3G(20.0G)], [Metaspace: 35539K->35539K(1081344K)]
 [Times: user=4.32 sys=0.00, real=1.62 secs]
```

The common feature is that both positions allocate a humongous contiguous block of memory.

Position 1: In WriteCache.forEach, with about 1M entries per minute, the sortedEntries size should be 1M * 4 * 2 * 8 = 64M bytes.
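To sanity-check that arithmetic, here is a minimal self-contained sketch; the per-entry layout (4 longs per entry, doubled for sort scratch space) reflects my reading of WriteCache in 4.14.x, and the names are illustrative rather than the actual internals:

```java
// Rough size of the single flat long[] that WriteCache.forEach sorts over.
public class SortedEntriesSize {
    public static void main(String[] args) {
        long entries = 1_000_000L;      // ~1M entries flushed per minute
        long longsPerEntry = 4;         // assumed: ledgerId, entryId, offset, length
        long sortScratch = 2;           // assumed working copy used while sorting
        long bytes = entries * longsPerEntry * sortScratch * Long.BYTES;
        System.out.println(bytes + " bytes"); // 64000000 bytes, i.e. ~64M
    }
}
```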

Position 2: With 1M ledgers per entry log, the table size of the ConcurrentLongLongHashMap should be 1M * 2 * 2 * 8 = 32M bytes. Sometimes there are more than 1M ledgers, so the memory needed can be even larger than 32M.
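The stack trace points at Section.rehash, which (in my reading of 4.14.x) doubles capacity by allocating one new contiguous array, so each rehash of a table this size is itself a humongous allocation. A back-of-the-envelope sketch, with layout assumptions mine (key and value stored as adjacent longs, table kept at most half full):

```java
// Rough size of one section's backing table after holding ~1M ledger entries.
public class LedgersMapTableSize {
    public static void main(String[] args) {
        long ledgers = 1_000_000L;  // ~1M ledgers per entry log
        long longsPerSlot = 2;      // key (ledgerId) and value (size) side by side
        long capacityFactor = 2;    // assumed headroom before the next rehash
        long bytes = ledgers * longsPerSlot * capacityFactor * Long.BYTES;
        System.out.println(bytes + " bytes"); // 32000000 bytes, i.e. ~32M
    }
}
```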

Since we use G1 and G1HeapRegionSize is 32m (the maximum value), there may be no contiguous regions available to satisfy such a humongous allocation. After pre-allocating a large buffer for sortedEntries and increasing the concurrencyLevel of the ConcurrentLongLongHashMap in EntryLogMetadata, the issue did not appear again. How about adding two configuration options for these?
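As a concrete illustration of the second mitigation, a minimal sketch using the 4.14.x (expectedItems, concurrencyLevel) constructor; the sizing values are illustrative, not a tested recommendation:

```java
import org.apache.bookkeeper.util.collections.ConcurrentLongLongHashMap;

// With more sections, each section's backing long[] stays far below G1's
// humongous threshold (half a region, i.e. 16MB with G1HeapRegionSize=32m).
public class LedgersMapSizing {
    public static void main(String[] args) {
        // 4.14.x constructor: (expectedItems, concurrencyLevel).
        ConcurrentLongLongHashMap ledgersMap = new ConcurrentLongLongHashMap(1_000_000, 64);
        ledgersMap.addAndGet(42L /* ledgerId */, 1024L /* entry size */);
        // ~1M / 64 = ~15.6K entries per section -> ~0.5MB per section array,
        // instead of one ~32MB table that needs contiguous free regions.
        System.out.println(ledgersMap.get(42L)); // 1024
    }
}
```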

Separately, because the metadata of all entry logs is held in memory, the memory it occupies is very large: at 32MB of EntryLogMetadata per entry log, the footprint reaches several GB with hundreds of entry logs. In our deployment, ledgers are deleted by time: a ledger is deleted after it expires, and an entry log whose data has not expired is never deleted. So this metadata does not need to be loaded into memory at all. How about adding a feature like this?
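To make the request concrete, here is a purely hypothetical sketch of such a feature; none of these names exist in BookKeeper, and a real change would have to live in the bookie's garbage collector:

```java
// Idea: when ledgers are removed only by time-based retention, an entry log
// can be deleted wholesale once all of its entries are past the retention
// window, without ever materializing its EntryLogMetadata in memory.
interface TimeBasedEntryLogGc {
    /** Newest entry timestamp recorded for the log, e.g. captured at rotation. */
    long maxEntryTimestampMillis(long entryLogId);

    /** Delete the whole entry log file. */
    void removeEntryLog(long entryLogId);

    default void gc(Iterable<Long> entryLogIds, long retentionMillis) {
        long cutoff = System.currentTimeMillis() - retentionMillis;
        for (long logId : entryLogIds) {
            if (maxEntryTimestampMillis(logId) < cutoff) {
                removeEntryLog(logId);  // no per-ledger metadata needed
            }
        }
    }
}
```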

hangc0276 commented 2 years ago

I will address this issue