apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

[2.7.4] NPE while open reader to pulsar topic #13713

Open tonyvelichko opened 2 years ago

tonyvelichko commented 2 years ago

Describe the bug: Opening a reader on the topic produces an NPE in the broker log and causes the client-side exception "Exclusive consumer is already connected".

16:08:18.406 [pulsar-stats-updater-25-1] ERROR org.apache.pulsar.broker.service.PulsarStats - Failed to generate namespace stats for namespace {namespace}: null
java.lang.NullPointerException: null
    at org.apache.bookkeeper.mledger.impl.ManagedCursorContainer.removeCursor(ManagedCursorContainer.java:128) ~[org.apache.pulsar-managed-ledger-2.7.4.jar:2.7.4]
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.deactivateCursor(ManagedLedgerImpl.java:3122) ~[org.apache.pulsar-managed-ledger-2.7.4.jar:2.7.4]
    at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.setInactive(ManagedCursorImpl.java:956) ~[org.apache.pulsar-managed-ledger-2.7.4.jar:2.7.4]
    at org.apache.pulsar.broker.service.persistent.PersistentTopic.lambda$checkBackloggedCursors$74(PersistentTopic.java:2063) ~[org.apache.pulsar-pulsar-broker-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:387) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:159) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.broker.service.persistent.PersistentTopic.checkBackloggedCursors(PersistentTopic.java:2058) ~[org.apache.pulsar-pulsar-broker-2.7.4.jar:2.7.4]
    at org.apache.pulsar.broker.service.PulsarStats.lambda$null$1(PulsarStats.java:141) ~[org.apache.pulsar-pulsar-broker-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:387) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:159) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.broker.service.PulsarStats.lambda$null$3(PulsarStats.java:131) ~[org.apache.pulsar-pulsar-broker-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:387) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:159) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.broker.service.PulsarStats.lambda$updateStats$4(PulsarStats.java:120) ~[org.apache.pulsar-pulsar-broker-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:387) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:159) ~[org.apache.pulsar-pulsar-common-2.7.4.jar:2.7.4]
    at org.apache.pulsar.broker.service.PulsarStats.updateStats(PulsarStats.java:110) ~[org.apache.pulsar-pulsar-broker-2.7.4.jar:2.7.4]
    at org.apache.pulsar.broker.service.BrokerService.updateRates(BrokerService.java:1370) ~[org.apache.pulsar-pulsar-broker-2.7.4.jar:2.7.4]
    at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.7.4.jar:2.7.4]
    at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_312]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_312]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_312]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_312]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_312]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_312]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.68.Final.jar:4.1.68.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]

Additional context: The same workload run on 2.7.1 produced no errors.

gaozhangmin commented 2 years ago

#12297 fixed it.

tonyvelichko commented 2 years ago

Thanks @gaozhangmin, I missed that!

But still, I think the NPE is just a symptom, because the bookie is full of errors (for about 280 unique ledgers) reporting that the ledger is missing: No ledger found while performing readLac from ledger: 3420

gaozhangmin commented 2 years ago

The number of lost bookies should be less than managedLedgerDefaultAckQuorum. Otherwise, this error will be thrown.
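The rule of thumb in the comment above can be sketched as simple arithmetic. This is only an illustration, assuming that every acknowledged entry is stored on at least ackQuorum bookies, so in the worst case all lost bookies held copies of the same entry. The class and method names are hypothetical and are not part of Pulsar or BookKeeper.

```java
// Hypothetical helper for illustration only; not a Pulsar or BookKeeper API.
final class AckQuorumCheck {
    private AckQuorumCheck() {}

    /**
     * Worst-case number of surviving copies of an acknowledged entry,
     * assuming it was stored on exactly ackQuorum bookies and every
     * lost bookie held one of those copies.
     */
    static int worstCaseSurvivingCopies(int ackQuorum, int lostBookies) {
        if (ackQuorum < 1 || lostBookies < 0) {
            throw new IllegalArgumentException("ackQuorum must be >= 1 and lostBookies >= 0");
        }
        return Math.max(0, ackQuorum - lostBookies);
    }

    /** True when at least one copy of every acknowledged entry must still exist. */
    static boolean allAckedEntriesReadable(int ackQuorum, int lostBookies) {
        return worstCaseSurvivingCopies(ackQuorum, lostBookies) > 0;
    }
}
```

For example, with an ack quorum of 2, losing one bookie still leaves a copy of every acknowledged entry under this assumption, while losing two bookies does not guarantee that.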

github-actions[bot] commented 2 years ago

The issue has had no activity for 30 days; marking it with the Stale label.
