dtr-org / unit-e

A digital currency for a new era of decentralized trust
https://unit-e.io
MIT License

Node cannot restart in testnet #955

Closed by frolosofsky 5 years ago

frolosofsky commented 5 years ago

Describe the bug

After finishing the initial sync, the node cannot restart properly.

2019-04-11 19:45:09 [finalization] Restoring state repository from disk, last_finalized_epoch=37
2019-04-11 19:45:10 [finalization] Loaded 70 states
2019-04-11 19:45:10 [            ] WARN: State for block=6da9034a51be17adc6a0444470ae1972fc3082f7acb724cc30bacf538b57d366 height=1906 missed in the finalization state database.
2019-04-11 19:45:10 [            ] Trying to recover the following states from block index database or block files: [834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0, 44d0b25512cc04a739406c5d14b16440a487eaba7719edb38bb03900fcd88f1e, 228247d75e0e6ae9b053384ee5b9c5a044bec83e229ef1eb5fed74434597663f, a2b28fe6f5d4f52d6b5703741deedcb16f29ae0e7bba5dd3d1b09940185a0a69, 6da9034a51be17adc6a0444470ae1972fc3082f7acb724cc30bacf538b57d366]
2019-04-11 19:45:10 [            ] ERROR: ReadBlockFromDisk: OpenBlockFile failed for CBlockDiskPos(nFile=-1, nPos=0)
2019-04-11 19:45:10 [            ] Cannot read block=834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0 to restore finalization state for block=6da9034a51be17adc6a0444470ae1972fc3082f7acb724cc30bacf538b57d366.
2019-04-11 19:45:10 [            ] Need sync
2019-04-11 19:45:10 [            ] Cannot load block=834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0
2019-04-11 19:45:10 [            ] : Error opening block database.
Please restart with -reindex or -reindex-chainstate to recover.
: Error opening block database.
Please restart with -reindex or -reindex-chainstate to recover.

To Reproduce

  1. Launch the node, e.g. src/unit-e -printtoconsole -debug=all -datadir=./somewhere.
  2. Wait for the initial sync to complete.
  3. Stop the node.
  4. Start the node.

Expected behavior

The node is expected to start successfully.

Environment

unit-e 7230215b25da7e61371dfa1cde571fac91d56033

Thoughts

I forced the node to load all finalization states; the result is pretty much the same (note the total number of loaded states: the actual chain height was about 2k):

2019-04-11 19:49:03 [finalization] Restore state repository from disk, Load all states.
2019-04-11 19:49:03 [            ] leveldb: Generated table #14@0: 29453 keys, 1473689 bytes
2019-04-11 19:49:03 [            ] leveldb: Compacted 4@0 + 0@1 files => 1473689 bytes
2019-04-11 19:49:03 [finalization] Loaded 71 states
2019-04-11 19:49:03 [            ] WARN: State for block=6da9034a51be17adc6a0444470ae1972fc3082f7acb724cc30bacf538b57d366 height=1906 missed in the finalization state database.
2019-04-11 19:49:03 [            ] Trying to recover the following states from block index database or block files: [834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0, 44d0b25512cc04a739406c5d14b16440a487eaba7719edb38bb03900fcd88f1e, 228247d75e0e6ae9b053384ee5b9c5a044bec83e229ef1eb5fed74434597663f, a2b28fe6f5d4f52d6b5703741deedcb16f29ae0e7bba5dd3d1b09940185a0a69, 6da9034a51be17adc6a0444470ae1972fc3082f7acb724cc30bacf538b57d366]
2019-04-11 19:49:03 [            ] ERROR: ReadBlockFromDisk: OpenBlockFile failed for CBlockDiskPos(nFile=-1, nPos=0)
2019-04-11 19:49:03 [            ] Cannot read block=834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0 to restore finalization state for block=6da9034a51be17adc6a0444470ae1972fc3082f7acb724cc30bacf538b57d366.
2019-04-11 19:49:03 [            ] Need sync
2019-04-11 19:49:03 [            ] Cannot load block=834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0
2019-04-11 19:49:03 [            ] : Error opening block database.
Please restart with -reindex or -reindex-chainstate to recover.
: Error opening block database.
Please restart with -reindex or -reindex-chainstate to recover.

There are several directions for investigation, which could be split out into individual issues/PRs if needed.

  1. Why are there only 71 states available?
  2. Why is this data not enough to restore the finalization states?
  3. Block 834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0. Why is its state not restored from its CBlockIndex commits? Why can we not load it from disk?
  4. Try to reproduce in a functional test.

frolosofsky commented 5 years ago

Some explanations:

Why are there only 71 states available?

Because during initial block download (IBD) the node, for some mysterious reason, decided not to flush the data to disk. This is not actually a big issue unless the node crashes during IBD. Note that the node does flush states in the regtest environment.

Why is this data not enough to restore the finalization states?

Block 834dba65340114ed1ca1bca216607a6a8281057fc27b034dd018fb763b46fcf0. Why is its state not restored from its CBlockIndex commits? Why can we not load it from disk?

The data was enough to restore the finalization states of the active and parallel chains, but the node also tried to restore states for block indexes that have no block data (we received them only via headers).
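To illustrate the point about header-only block indexes, here is a minimal sketch of separating restorable entries from ones whose block body has not been downloaded yet. It is not the actual unit-e code or the fix for this issue: `BlockIndexSketch`, `RecoveryPlan`, `HasBlockData` and `PartitionRecoverable` are hypothetical names, and the sketch assumes that a file number of -1 (as in the `CBlockDiskPos(nFile=-1, nPos=0)` log line above) means the block has no data on disk.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a block index entry.
// Assumption: file == -1 means only the header is known (no block data on disk).
struct BlockIndexSketch {
    std::string hash;
    int height;
    int file;
};

// Result of splitting the candidates for finalization state recovery.
struct RecoveryPlan {
    std::vector<BlockIndexSketch> recoverable;  // block data is on disk
    std::vector<BlockIndexSketch> deferred;     // header-only, wait for download
};

// Returns true when the block body can be read from disk.
bool HasBlockData(const BlockIndexSketch &index) {
    return index.file >= 0;
}

// Splits candidates so that recovery is attempted only for blocks with data,
// instead of failing startup on a header-only entry.
RecoveryPlan PartitionRecoverable(const std::vector<BlockIndexSketch> &candidates) {
    RecoveryPlan plan;
    for (const auto &index : candidates) {
        if (HasBlockData(index)) {
            plan.recoverable.push_back(index);
        } else {
            plan.deferred.push_back(index);
        }
    }
    // Sanity check: every candidate ends up in exactly one bucket.
    assert(plan.recoverable.size() + plan.deferred.size() == candidates.size());
    return plan;
}
```

With this split, a startup routine could restore states only for `plan.recoverable` and leave `plan.deferred` until the corresponding blocks arrive, rather than treating the failed disk read as a database error.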

Try to reproduce in a functional test.

Reproduced in unit tests.

#959.