Problem Overview
The current implementation of the db checkpoint feature has a synchronization bug:
While we take the db checkpoint in the background, we don't align the rest of the state (the block number, bft metadata, pending reserved pages, and more) with the checkpoint sequence number.
Once client requests are sent at a resolution other than 150, a wide range of issues starts to appear.
For example:
The recovered replica doesn't start from a stable checkpoint; instead, it starts from the point where the db checkpoint was taken.
If the checkpoint was taken in the middle of another execution phase, the pending reserved pages needed to recover correctly are missing.
We trim the blocks at the point where the db checkpoint was taken, but we don't update the bft metadata accordingly.
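To make the alignment problem concrete, here is a minimal sketch of the arithmetic involved. It assumes that the 150 mentioned above is the checkpoint window size in sequence numbers, which this description does not state explicitly; `CHECKPOINT_WINDOW`, `last_stable_checkpoint` and `is_aligned` are hypothetical names used only for illustration, not identifiers from the code base.

```python
# Assumption: stable checkpoints fall on multiples of a fixed window.
# The value 150 is taken from the description above; its exact meaning
# in the code base is assumed here, not confirmed.
CHECKPOINT_WINDOW = 150


def last_stable_checkpoint(seq_num: int, window: int = CHECKPOINT_WINDOW) -> int:
    """Return the highest window-aligned sequence number <= seq_num."""
    return (seq_num // window) * window


def is_aligned(seq_num: int, window: int = CHECKPOINT_WINDOW) -> bool:
    """True when seq_num sits exactly on a checkpoint boundary."""
    return seq_num % window == 0


if __name__ == "__main__":
    for seq in (300, 304):
        print(seq, is_aligned(seq), last_stable_checkpoint(seq))
```

Under this assumption, a db checkpoint taken at sequence number 304 is not on a checkpoint boundary, so a replica restored from it has no matching stable checkpoint to start from.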
Below is an example of some of these issues:
On replica 0, block 302 was created at sequence number 304.
This PR proposes a fix in which we pin the bft sequence number before starting the async part and align everything accordingly.
This PR doesn't handle the case where the operator explicitly creates a db checkpoint, as that flow is assumed to be used for clients only (which care only about the blockchain).
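The idea of pinning before the async part can be sketched roughly as follows. This is an illustrative model only, not the PR's code: `Replica`, `DbCheckpointMetadata`, `create_db_checkpoint_async` and the `gate` event are hypothetical names, and the real implementation also has to align bft metadata and pending reserved pages, which this sketch leaves out.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class DbCheckpointMetadata:
    seq_num: int        # bft sequence number the checkpoint is pinned to
    last_block_id: int  # last block that belongs to the checkpoint


class Replica:
    """Toy replica (hypothetical): tracks the last executed seq num and block."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.last_executed_seq_num = 0
        self.last_block_id = 0
        self.checkpoints: list[DbCheckpointMetadata] = []

    def execute(self, seq_num: int) -> None:
        # Each executed sequence number creates one block in this toy model.
        with self._lock:
            self.last_executed_seq_num = seq_num
            self.last_block_id += 1

    def create_db_checkpoint_async(self, gate: threading.Event):
        # Pin the state synchronously, BEFORE the background work starts.
        with self._lock:
            pinned = DbCheckpointMetadata(self.last_executed_seq_num,
                                          self.last_block_id)
        worker = threading.Thread(target=self._write_checkpoint,
                                  args=(pinned, gate))
        worker.start()
        return pinned, worker

    def _write_checkpoint(self, pinned: DbCheckpointMetadata,
                          gate: threading.Event) -> None:
        # The gate stands in for slow background I/O. Only the pinned
        # values are used here, never the replica's live state.
        gate.wait()
        with self._lock:
            self.checkpoints.append(pinned)


def demo():
    replica = Replica()
    for seq in range(1, 301):
        replica.execute(seq)
    gate = threading.Event()
    pinned, worker = replica.create_db_checkpoint_async(gate)
    replica.execute(301)  # execution continues while the checkpoint is pending
    gate.set()
    worker.join()
    return pinned, replica


if __name__ == "__main__":
    pinned, replica = demo()
    print(pinned, replica.last_executed_seq_num)
```

Because the metadata is captured under the lock before the worker thread starts, the stored checkpoint describes the state at the pinned sequence number even though the replica keeps executing requests in the meantime.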
Testing Done
CI, plus a change to an existing test to verify the fix.
This PR proposes a fix in which we pin the bft sequence number before starting the async part and align everything accordingly.
Can you detail what data you persist differently from before and why?
However, on recovery, the recovered replica shows the same block (block 302 in the example above) as created at sequence number 305: