Closed TheMMaciek closed 3 years ago
I don't think we need this right now: such issues rarely occur on mainnet, and in my opinion this issue should be addressed where it originates, i.e. in majorityStateChooser. The fix there would be to prevent majorityStateChooser from picking a majority "for heights above the gap", that is, "when majority for previous height wasn't picked". Given these observations, I'm closing this PR.
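To make the proposed rule concrete, here is a rough sketch of "don't pick a majority above a gap". It is not the actual majorityStateChooser code: the function name, its parameters, and the assumed fixed spacing between snapshot heights (`height_interval`) are all hypothetical.

```python
from typing import Iterable, Optional


def next_pickable_height(
    last_majority_height: int,
    candidate_heights: Iterable[int],
    height_interval: int = 2,  # assumption: snapshot heights advance by a fixed step
) -> Optional[int]:
    """Return the only height a majority may be picked for, or None.

    A majority is allowed only for the height directly following the last
    height that already has one. Candidates above a gap are ignored.
    """
    expected = last_majority_height + height_interval
    return expected if expected in set(candidate_heights) else None
```

With this guard, proposals for heights 8 and 10 would be ignored while height 6 still has no majority, so the gap could never be created in the first place.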
If we want to prevent gaps in the buckets, we could introduce a check like this one. Essentially, it checks whether the previous snapshot was sent completely before sending the next one. If a permanent gap occurs, the node stops sending snapshots to the bucket altogether. Based on the introduced metrics we could create an alert: if the snapshots-sent-to-cloud metric stops progressing while the network keeps progressing, that would be a sign to perform a rollback.
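The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SnapshotSender`, `try_send`, the `upload` callback, the counters, and the assumed fixed spacing between snapshot heights are all hypothetical names and assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SnapshotSender:
    """Hypothetical sender that refuses to upload past a gap."""

    height_interval: int = 2  # assumption: heights advance by a fixed step
    last_sent_height: Optional[int] = None  # None until the first full upload
    sent_count: int = 0  # stand-in for a "snapshots sent to cloud" metric
    skipped_count: int = 0  # stand-in for a "blocked by gap" metric

    def try_send(self, height: int, upload: Callable[[int], bool]) -> bool:
        """Upload `height` only if it directly follows the last sent height."""
        if self.last_sent_height is not None:
            expected = self.last_sent_height + self.height_interval
            if height != expected:
                # The previous snapshot is missing: do not send past the gap.
                self.skipped_count += 1
                return False
        if not upload(height):
            # Upload did not complete, so the height stays unsent.
            return False
        self.last_sent_height = height
        self.sent_count += 1
        return True
```

In this sketch, once a height is missing every later height is rejected, so `sent_count` stops growing. An alert comparing that counter with network progress would then flag the permanent gap.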