Bookie crash leaves the checkpoint in a corrupt state (restore fails with LogEmptyException: Log .../checkpoints/.../metadata: has no records)
13:28:16.419 [io-write-scheduler-OrderedScheduler-1-0] INFO org.apache.bookkeeper.stream.storage.impl.store.MVCCStoreFactoryImpl - Clearing resources hold by stream(1)/range(0) at storage container (6)
13:28:16.421 [io-write-scheduler-OrderedScheduler-1-0] WARN org.apache.bookkeeper.stream.storage.impl.sc.StorageContainerRegistryImpl - De-registered StorageContainer ('6') when failed to start
java.util.concurrent.CompletionException: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000006/000000000000000001/000000000000000000
...
at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$executeIO$16(AbstractStateStoreWithJournal.java:474) ~[org.apache.bookkeeper-statelib-4.9.3-SNAPSHOT.jar:4.9.3-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
...
Caused by: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000006/000000000000000001/000000000000000000
at org.apache.bookkeeper.statelib.impl.rocksdb.checkpoint.RocksCheckpointer.restore(RocksCheckpointer.java:84) ~[org.apache.bookkeeper-statelib-4.9.3-SNAPSHOT.jar:4.9.3-SNAPSHOT]
at org.apache.bookkeeper.statelib.impl.kv.RocksdbKVStore.loadRocksdbFromCheckpointStore(RocksdbKVStore.java:161) ~[org.apache.bookkeeper-statelib-4.9.3-SNAPSHOT.jar:4.9.3-SNAPSHOT]
at org.apache.bookkeeper.statelib.impl.kv.RocksdbKVStore.init(RocksdbKVStore.java:223) ~[org.apache.bookkeeper-statelib-4.9.3-SNAPSHOT.jar:4.9.3-SNAPSHOT]
at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$initializeLocalStore$5(AbstractStateStoreWithJournal.java:202) ~[org.apache.bookkeeper-statelib-4.9.3-SNAPSHOT.jar:4.9.3-SNAPSHOT]
at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$executeIO$16(AbstractStateStoreWithJournal.java:471) ~[org.apache.bookkeeper-statelib-4.9.3-SNAPSHOT.jar:4.9.3-SNAPSHOT]
...
To Reproduce
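The bookie has to crash at any point inside RocksdbCheckpointTask.checkpoint() after createDirectories() has completed but before finalizeCheckpoint() has run. The crash leaves a checkpoint whose metadata log exists but has no records, and the next restore attempt fails on it.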
See unit test in PR
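The crash window can be illustrated with a small local-filesystem model. This is only a sketch under stated assumptions, not BookKeeper code: the class CheckpointCrashWindow, the method bodies, the "cp-1" id, and the idea that an empty metadata file stands in for the empty metadata log are all hypothetical. Only the step names createDirectories() and finalizeCheckpoint() come from the report.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical model of the crash window; none of this is the BookKeeper API.
class CheckpointCrashWindow {

    // Assumed first step: the checkpoint directory and an empty metadata
    // file exist, but nothing has been recorded in the metadata yet.
    static Path createDirectories(Path root, String checkpointId) throws IOException {
        Path dir = Files.createDirectories(root.resolve("checkpoints").resolve(checkpointId));
        Files.createFile(dir.resolve("metadata"));
        return dir;
    }

    // Assumed last step: the metadata record is written, which makes the
    // checkpoint usable for restore.
    static void finalizeCheckpoint(Path dir) throws IOException {
        Files.write(dir.resolve("metadata"), "files: []\n".getBytes(StandardCharsets.UTF_8));
    }

    // A checkpoint with missing or empty metadata cannot be restored.
    static boolean isRestorable(Path dir) throws IOException {
        Path metadata = dir.resolve("metadata");
        return Files.exists(metadata) && Files.size(metadata) > 0;
    }

    // Runs the checkpoint steps in a fresh temp directory. When
    // crashBeforeFinalize is true, the process is treated as having died
    // between the two steps, so finalizeCheckpoint() never runs.
    static boolean simulate(boolean crashBeforeFinalize) {
        try {
            Path root = Files.createTempDirectory("checkpoint-demo");
            Path dir = createDirectories(root, "cp-1");
            if (!crashBeforeFinalize) {
                finalizeCheckpoint(dir);
            }
            return isRestorable(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("crash before finalize, restorable: " + simulate(true));
        System.out.println("clean checkpoint, restorable: " + simulate(false));
    }
}
```

In this model the interrupted checkpoint is visible on disk yet unusable, which mirrors the state the restore path runs into in the stack trace above.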
Expected behavior
The table service should be able to recover from the previous checkpoint plus the journal.
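The expected behavior can be sketched as a fallback rule: skip any checkpoint that was never finalized, restore the newest finalized one, then replay the journal entries written after it. The code below is an in-memory illustration only. The types Checkpoint and JournalEntry, the finalized flag, and the txid bookkeeping are assumptions made for this example, and the actual fix in the PR may work differently. It also assumes the journal still holds every entry newer than the last finalized checkpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model; these are not BookKeeper types.
class CheckpointFallbackRecovery {

    static final class Checkpoint {
        final long lastTxId;              // last journal txid covered by the snapshot
        final boolean finalized;          // false models a crash before finalizeCheckpoint()
        final Map<String, String> state;  // snapshot contents

        Checkpoint(long lastTxId, boolean finalized, Map<String, String> state) {
            this.lastTxId = lastTxId;
            this.finalized = finalized;
            this.state = state;
        }
    }

    static final class JournalEntry {
        final long txId;
        final String key;
        final String value;

        JournalEntry(long txId, String key, String value) {
            this.txId = txId;
            this.key = key;
            this.value = value;
        }
    }

    // Restores the newest finalized checkpoint (if any), then replays every
    // journal entry with a txid greater than the one the checkpoint covers.
    static Map<String, String> recover(List<Checkpoint> oldestFirst, List<JournalEntry> journal) {
        Map<String, String> state = new TreeMap<>();
        long restoredTxId = -1L;
        for (int i = oldestFirst.size() - 1; i >= 0; i--) {
            Checkpoint cp = oldestFirst.get(i);
            if (cp.finalized) {
                state.putAll(cp.state);
                restoredTxId = cp.lastTxId;
                break;
            }
            // Unfinalized checkpoint: ignore it and try the previous one.
        }
        for (JournalEntry e : journal) {
            if (e.txId > restoredTxId) {
                state.put(e.key, e.value);
            }
        }
        return state;
    }

    // One good checkpoint at txid 2, one interrupted checkpoint at txid 4.
    static Map<String, String> demo() {
        Map<String, String> snapshot = new TreeMap<>();
        snapshot.put("a", "1");
        snapshot.put("b", "1");

        List<Checkpoint> checkpoints = new ArrayList<>();
        checkpoints.add(new Checkpoint(2L, true, snapshot));
        checkpoints.add(new Checkpoint(4L, false, new TreeMap<String, String>()));

        List<JournalEntry> journal = new ArrayList<>();
        journal.add(new JournalEntry(1L, "a", "1"));
        journal.add(new JournalEntry(2L, "b", "1"));
        journal.add(new JournalEntry(3L, "a", "2"));
        journal.add(new JournalEntry(4L, "c", "1"));

        return recover(checkpoints, journal);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {a=2, b=1, c=1}
    }
}
```

The interrupted checkpoint at txid 4 is skipped, the snapshot at txid 2 is loaded, and entries 3 and 4 are replayed, so no acknowledged write is lost in this model.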
Original Issue: apache/bookkeeper#2565