Open nholland94 opened 4 years ago
One thing I noticed: once we change the required_local_state_sync target to best_tip instead of root, if we propose before ledger catchup finishes, the required_local_state_sync check in the proposer will fail and crash the node.
A possible solution would be to prevent the proposer from producing a block while we are still doing catchup. This would also solve other kinds of failures.
I think it would be unsafe to always prevent the proposer from producing a block during catchup, but I think we could make that a rule for the initial best tip ledger catchup we do after bootstrapping. Would be a little tricky since we would need to add the concept of a "required catchup" which would block the proposer while active.
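To make the "required catchup" idea concrete, here is a minimal sketch of the gating rule in Python (not the project's actual code). Every name in it (`Node`, `required_catchup_active`, `may_propose`, and so on) is hypothetical and chosen only for illustration. It assumes the node can tell the initial post-bootstrap catchup apart from ordinary catchups.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical flag: True while the initial best tip catchup that
    # follows bootstrap is still running.
    required_catchup_active: bool = False
    # Hypothetical flag: True while any later, ordinary catchup is running.
    ordinary_catchup_active: bool = False

    def finish_bootstrap(self) -> None:
        # Bootstrap hands off to the initial best tip ledger catchup,
        # which this sketch treats as "required".
        self.required_catchup_active = True

    def finish_required_catchup(self) -> None:
        self.required_catchup_active = False

    def may_propose(self) -> bool:
        # Only the required catchup blocks the proposer; ordinary
        # catchups do not, since always blocking would be unsafe.
        return not self.required_catchup_active


node = Node()
node.finish_bootstrap()
blocked_after_bootstrap = not node.may_propose()

node.finish_required_catchup()
node.ordinary_catchup_active = True
allowed_during_ordinary_catchup = node.may_propose()
```

In this sketch the proposer is blocked only between the end of bootstrap and the end of the first catchup. A later catchup leaves `may_propose()` returning `True`.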
Can you detail the case in which a proposer produces a block before ledger catchup completes, and how that would cause the required_local_state_sync check to fail? I don't see, off the top of my head, how that would cause the check to fail, but I'm probably just not seeing what you are.
Because bootstrap now uses the peer's best_tip in required_local_state_sync, while if catchup hasn't finished, our best_tip = root, and in the proposer we run required_local_state_sync against our best_tip, which is just the root.
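The mismatch can be illustrated with a toy model in Python. This is only a sketch: the `Tip` type, the string ids, and the body of `required_local_state_sync` are stand-ins invented for illustration, not the real check. It also assumes that the root and the peer's best tip can require different local state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tip:
    name: str
    # Hypothetical: ids of the local state that this tip requires.
    required_state: frozenset


def required_local_state_sync(tip: Tip, local_state: set) -> set:
    """Toy stand-in: return the required state that is missing locally."""
    return set(tip.required_state) - local_state


# Assumption: the root and the peer's best tip need different local state.
root = Tip("root", frozenset({"state_for_root"}))
peer_best_tip = Tip("peer_best_tip", frozenset({"state_for_peer_tip"}))

# Bootstrap syncs local state against the peer's best tip.
local_state = set(peer_best_tip.required_state)

# Catchup has not finished, so our own best tip is still the root.
our_best_tip = root

missing_for_bootstrap = required_local_state_sync(peer_best_tip, local_state)
missing_for_proposer = required_local_state_sync(our_best_tip, local_state)
```

Under these assumptions `missing_for_bootstrap` is empty, while `missing_for_proposer` is not, which corresponds to the proposer's check failing even though bootstrap's check passed.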
We discussed this during the bootstrap meeting. We haven't recollected the exact edge case conditions, but we need to stop crashing in the case @ghost-not-in-the-shell identified.
There is an edge case in checking local state sync requirements at the completion of bootstrap which we do not correctly cover right now. Below is my explanation of the edge case and how to address it, taken from a comment I left on #4027 (comment here https://github.com/CodaProtocol/coda/pull/4027#pullrequestreview-328519894).