streamnative / pulsar-archived

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org
Apache License 2.0

ISSUE-6894: Still having issues of Failed to restore rockdb #956

Open sijie opened 4 years ago

sijie commented 4 years ago

Original Issue: apache/pulsar#6894


Describe the bug
We are still hitting the "Failed to restore rocksdb" error when running in standalone mode. We did not change any BookKeeper configuration.

This might be related to https://github.com/apache/pulsar/issues/5668

With -nss (--no-stream-storage), standalone starts fine.

So, should we run with -nss until this is fixed?

To Reproduce

Logs

13:29:28.310 [io-write-scheduler-OrderedScheduler-0-0] WARN org.apache.bookkeeper.stream.storage.impl.sc.ZkStorageContainerManager - Failed to start storage container (0)
java.util.concurrent.CompletionException: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000000/000000000000000000/000000000000000000
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:957) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_242]
    at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$executeIO$16(AbstractStateStoreWithJournal.java:474) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) [com.google.guava-guava-25.1-jre.jar:?]
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) [com.google.guava-guava-25.1-jre.jar:?]
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) [com.google.guava-guava-25.1-jre.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_242]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000000/000000000000000000/000000000000000000
    at org.apache.bookkeeper.statelib.impl.rocksdb.checkpoint.RocksCheckpointer.restore(RocksCheckpointer.java:84) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.statelib.impl.kv.RocksdbKVStore.loadRocksdbFromCheckpointStore(RocksdbKVStore.java:161) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.statelib.impl.kv.RocksdbKVStore.init(RocksdbKVStore.java:223) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$initializeLocalStore$5(AbstractStateStoreWithJournal.java:202) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$executeIO$16(AbstractStateStoreWithJournal.java:471) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
    ... 12 more
Caused by: org.apache.distributedlog.exceptions.LogEmptyException: Log 000000000000000000/000000000000000000/000000000000000000/checkpoints/e6ac48ab-1045-472e-89d0-95686a71ee8d/metadata: has no records
    at org.apache.distributedlog.BKLogHandler$2$1.onSuccess(BKLogHandler.java:245) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
    at org.apache.distributedlog.BKLogHandler$2$1.onSuccess(BKLogHandler.java:239) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:42) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:26) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_242]
    at org.apache.distributedlog.BKLogHandler.readLogSegmentsFromStore(BKLogHandler.java:636) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
    at org.apache.distributedlog.BKLogHandler$6.onSuccess(BKLogHandler.java:600) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
    at org.apache.distributedlog.BKLogHandler$6.onSuccess(BKLogHandler.java:592) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:42) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:26) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_242]
    at org.apache.distributedlog.impl.ZKLogSegmentMetadataStore.processResult(ZKLogSegmentMetadataStore.java:377) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
    at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1174) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:627) ~[org.apache.pulsar-pulsar-zookeeper-2.5.1.jar:2.5.1]
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) ~[org.apache.pulsar-pulsar-zookeeper-2.5.1.jar:2.5.1]
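
A note on reading this trace: the outermost CompletionException is only the wrapper that CompletableFuture adds. The StateStoreException wraps the actual root cause, which is the LogEmptyException saying that the checkpoint metadata log has no records. The sketch below shows, with JDK classes only, how such a chain can be unwrapped to reach the innermost cause. The `RootCause` helper and the stand-in exceptions are hypothetical and are not part of BookKeeper or Pulsar.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical helper for illustration; not part of BookKeeper or Pulsar.
final class RootCause {
    private RootCause() {}

    // Follow getCause() links until the innermost throwable is reached.
    static Throwable of(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in exceptions that mimic the nesting seen in the log:
        // CompletionException -> "Failed to restore rocksdb" -> "has no records".
        CompletableFuture<Void> restore = CompletableFuture.runAsync(() -> {
            throw new IllegalStateException("Failed to restore rocksdb",
                    new RuntimeException("checkpoint metadata log has no records"));
        });
        try {
            restore.join();
        } catch (CompletionException e) {
            // join() reports the failure wrapped in a CompletionException.
            System.out.println("outer: " + e.getClass().getSimpleName());
            System.out.println("root:  " + RootCause.of(e).getMessage());
        }
    }
}
```

When triaging similar reports, the last "Caused by" entry is usually the one worth searching for first.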

Additional context
This issue occurs on Pulsar 2.5.1.

nicolo-paganin commented 4 years ago

I still see this error in Pulsar 2.6.0. Any news?