Open jagerman opened 5 years ago
One other thing I notice is that a checkpoint goes out for the current top block immediately. This seems too soon:
- BBCBB
`BBBBCBB
`BBBBCBB
`BBBBCBB
` ... etc.
If checkpoints were a consensus component, so that every node ranks (valid) chains lexicographically according to:
then this attack would not work: nodes would always prefer a chain with one checkpoint to one with zero, even if the one with zero has more work.
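To make the ranking idea concrete, here is a minimal sketch. The ordering key itself is not shown above, so this assumes it is the pair (number of checkpoints, cumulative work), which is what the description of the outcome implies. `ChainSummary`, `rank_key`, `preferred`, and the numbers are all hypothetical and not taken from the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainSummary:
    """Hypothetical summary of a candidate chain (illustration only)."""
    name: str
    checkpoints: int       # number of checkpoints the chain contains
    cumulative_work: int   # total cumulative difficulty of the chain


def rank_key(chain: ChainSummary) -> tuple:
    # Assumed lexicographic key: checkpoints first, work only as a tie-breaker.
    return (chain.checkpoints, chain.cumulative_work)


def preferred(a: ChainSummary, b: ChainSummary) -> ChainSummary:
    # Python compares tuples lexicographically, so a chain with more
    # checkpoints wins regardless of how much work the other chain has.
    return a if rank_key(a) >= rank_key(b) else b


def preferred_by_work_only(a: ChainSummary, b: ChainSummary) -> ChainSummary:
    # Contrast: the plain "most cumulative work wins" rule.
    return a if a.cumulative_work >= b.cumulative_work else b


# Made-up numbers: a checkpointed chain with less work than an alt chain.
sn_chain = ChainSummary("sn", checkpoints=1, cumulative_work=1_000)
alt_chain = ChainSummary("alt", checkpoints=0, cumulative_work=5_000)

print(preferred(sn_chain, alt_chain).name)
print(preferred_by_work_only(sn_chain, alt_chain).name)
```

Under the assumed key the checkpointed chain is chosen even though it has less work, while the work-only rule picks the alt chain.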
If a reorg arrives that is checkpointed, but the current chain is relatively long and not checkpointed, some really strange behaviour happens that results in nodes failing to switch to the checkpointed chain.
Last night the testnet was mining an alt chain that was separate from the chain that all the service nodes were on. (This was caused by the difficulty bug, which doesn't seem to be solved!) So the situation was:
The SN continually communicated this checkpoint to nodes, but they apparently ignored it because:
At this point I started mining on the SN chain and restarted a node that was on the non-SN chain. It immediately reorg'ed from the non-SN chain to the SN chain:
and, because this was a large chain, it had to do a full SN rescan:
then the weird stuff happens:
Those errors do not look innocuous. But then it continues to REORG itself back to the non-SN chain, for some reason doing a one block reorg for each block:
which continues all the way up to:
Then after the recalculation we get some other WARNINGs (which I am pointing out here because they might be related to the difficulty issue but I have not investigated):
and then we sync back to the alt chain again:
This same process keeps repeating over and over forever.
The only thing that stopped this infinite loop was that the cumulative difficulty on the SN chain eventually overtook the cumulative difficulty on the non-SN chain, at which point everything reorged to the checkpointed SN chain.
This all suggests some problems in the checkpointing implementation that need to be addressed: