vmware / concord-bft

Concord byzantine fault tolerant state machine replication library

[WIP] fix db checkpoint async bug #2998

Closed. yontyon closed this 1 year ago

yontyon commented 1 year ago

In this PR we propose a wide-ranging change that fixes the problem. Until now, the decision whether to create a db checkpoint was made by the primary alone. If the primary decided that it was time to create a db checkpoint, it sent a BFT command whose execution creates the db checkpoint. (Note that this approach, regardless of the above bugs, is not safe against DoS attacks: a malicious primary can order the replicas to create db checkpoints continuously.)

Here we propose a different solution: the decision to create a db checkpoint is based on a deterministic local event (such as how much time has passed since the last db checkpoint was created). Once a replica decides to create one, it registers a db checkpoint creation callback on the stable sequence number event. When the replica reaches that stable sequence number, it starts creating the db checkpoint asynchronously, and the block number is now aligned with the sequence number because the checkpoint is taken right after that sequence number is executed.

To make the above feasible, (1) we cannot rely on local timeouts (instead, we consider only the time received through consensus), and (2) the db checkpoint metadata (such as the sequence number and timestamp) has to be shared between all replicas (via reserved pages).
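The flow described above can be illustrated with a minimal sketch. It is not the code from this PR: `DbCheckpointScheduler`, `DbCheckpointMetadata`, `onExecuted`, and `onStable` are hypothetical names. The fixed checkpoint window and the choice of the next stable sequence number as the target are assumptions. The callback runs synchronously here, whereas the PR creates the checkpoint asynchronously, and the metadata lives in a plain struct rather than in reserved pages.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

using SeqNum = std::uint64_t;
using Seconds = std::chrono::seconds;

// Hypothetical stand-in for the metadata that the PR shares via reserved pages.
struct DbCheckpointMetadata {
  SeqNum lastCheckpointSeqNum = 0;
  Seconds lastCheckpointTime{0};
};

// Hypothetical scheduler: every replica feeds it the same inputs
// (executed sequence numbers and consensus-agreed time), so every
// replica reaches the same decision without any order from the primary.
class DbCheckpointScheduler {
 public:
  using Callback = std::function<void(SeqNum)>;

  DbCheckpointScheduler(Seconds interval, SeqNum window, Callback create)
      : interval_(interval), window_(window), create_(std::move(create)) {}

  // Called after a sequence number is executed. `consensusTime` must be the
  // time agreed on through consensus, never the local clock.
  void onExecuted(SeqNum seq, Seconds consensusTime) {
    if (pending_) return;  // a checkpoint is already scheduled
    if (consensusTime - meta_.lastCheckpointTime < interval_) return;
    // Assumption: stable sequence numbers are multiples of `window_`;
    // target the first one at or after `seq`.
    SeqNum target = ((seq + window_ - 1) / window_) * window_;
    pending_ = std::make_pair(target, consensusTime);
  }

  // Called when `seq` becomes stable. Fires the registered callback once the
  // scheduled stable sequence number has been reached.
  void onStable(SeqNum seq) {
    if (!pending_ || seq < pending_->first) return;
    meta_.lastCheckpointSeqNum = seq;
    meta_.lastCheckpointTime = pending_->second;
    pending_.reset();
    create_(seq);  // the PR does this asynchronously; synchronous here
  }

  const DbCheckpointMetadata& metadata() const { return meta_; }

 private:
  Seconds interval_;
  SeqNum window_;
  Callback create_;
  DbCheckpointMetadata meta_;
  std::optional<std::pair<SeqNum, Seconds>> pending_;
};
```

Because the decision depends only on values that all replicas agree on, two replicas that process the same sequence of `onExecuted` and `onStable` calls schedule the checkpoint at the same sequence number, and a primary has no command with which to force extra checkpoints.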