Closed stringhandler closed 2 months ago
Sounds good. Except that I would say they should download the checkpoint at the beginning of the current epoch (or potentially at the last mini checkpoint location in the current epoch) and then replay all blocks since then, up to the current tip.
Implemented in #1067
Problem
When a registered validator joins the network for the first time, there is a high bandwidth and processing cost which worsens as the network progresses.
Currently, this includes:
A. Syncing the whole state of the shard, including DOWN substates.
B. Syncing all historical transactions for the shard/s.
C. Syncing the entire block history/chain of one or more shards.
(A) is unavoidable and should be optimised to reduce time and bandwidth costs.
(B) should not be necessary at all. The main reason it exists is to prevent duplicate transactions from being processed. There may also be some code that expects the transaction to be available (e.g. in web UIs). However, duplicate transactions are already prevented by the TransactionReceipt substate. A validator node may want to build an index of these receipts as it syncs, so that it can reject a duplicate transaction early instead of letting it fail in the commit phase. Some archival nodes may want to track and store historical transactions, but that is a separate concern, and consensus/block producer nodes would generally not do this.
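A minimal sketch of such an index, in Python for brevity. The class name, the string substate kinds and the transaction IDs are all hypothetical and only illustrate the idea; a real node would key this on its own substate and transaction ID types.

```python
from dataclasses import dataclass, field


@dataclass
class ReceiptIndex:
    """Hypothetical in-memory index of transactions that already have a receipt."""

    seen: set = field(default_factory=set)

    def record_substate(self, substate_kind: str, transaction_id: str) -> None:
        # While syncing state, remember the transaction behind each receipt substate.
        if substate_kind == "TransactionReceipt":
            self.seen.add(transaction_id)

    def is_duplicate(self, transaction_id: str) -> bool:
        # Cheap pre-check, done before the transaction reaches the commit phase.
        return transaction_id in self.seen


index = ReceiptIndex()
index.record_substate("TransactionReceipt", "tx-1")
index.record_substate("Component", "tx-2")  # not a receipt, so not indexed
print(index.is_duplicate("tx-1"), index.is_duplicate("tx-2"))
```

The index is only an optimisation: the receipt substate remains the thing that actually prevents the duplicate, so the index can be rebuilt from state at any time.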
(C) requires more thought. It is important that a new validator node knows that it has the complete and agreed shard state at or close to the end of the previous epoch. Beyond that, knowing which transactions were historically processed in which block by whom is not necessary to proceed with consensus in a new epoch.
Proposal 1
Start a new chain for each new epoch and verify a checkpoint proof.
For a validator to join a shard for an epoch, it requests a succinct proof asserting that at least two thirds of the validator set for the shard-epoch have committed a given state.
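As a rough illustration of the threshold, here is a sketch that assumes the usual BFT quorum of 2f+1 signers out of n = 3f+1 validators, i.e. strictly more than two thirds. That exact rule is an assumption and is not spelled out above. The function names and the string validator IDs are hypothetical.

```python
def quorum_threshold(committee_size: int) -> int:
    # Smallest number of signers that is strictly more than two thirds of the committee.
    return (2 * committee_size) // 3 + 1


def has_quorum(signers: set, committee: set) -> bool:
    # Only count signers that are actually members of the shard-epoch committee.
    valid_signers = signers & committee
    return len(valid_signers) >= quorum_threshold(len(committee))


committee = {"vn1", "vn2", "vn3", "vn4"}
print(has_quorum({"vn1", "vn2", "vn3"}, committee))
print(has_quorum({"vn1", "vn2", "outsider"}, committee))
```

With four validators the threshold is three, so the first call passes and the second fails because one of its signers is not in the committee.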
To achieve this the following is done:
Checkpoint Proof
The exact construction of the proof still needs more thought.
A shard-epoch checkpoint proof for the current commit block can be generated at any point by a participating validator node. The proof contains the jellyfish Merkle root of the state as well as the last 3 linked QCs of the block that is represented in the proof.
For example, if the current tip block is 300, the proof contains the linked QCs for blocks 299, 298, and 297. This proves that block 297 was committed by all non-faulty nodes.
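Since the construction is still open, the following is only a sketch of the linkage check, in Python for brevity. All type and field names are hypothetical, and it assumes each QC records the ID of the block it certifies and of that block's parent. Signature and quorum verification of each QC is deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QuorumCertificate:
    block_height: int
    block_id: str
    parent_id: str  # ID of the block that this block extends


@dataclass(frozen=True)
class CheckpointProof:
    state_merkle_root: str
    qcs: List[QuorumCertificate]  # newest first, e.g. heights 299, 298, 297


def committed_height(proof: CheckpointProof) -> Optional[int]:
    """Return the committed block height if the three QCs form a direct chain."""
    if len(proof.qcs) != 3:
        return None
    # Each QC must certify the direct child of the block certified by the next QC.
    for child, parent in zip(proof.qcs, proof.qcs[1:]):
        if child.parent_id != parent.block_id:
            return None
        if child.block_height != parent.block_height + 1:
            return None
    return proof.qcs[-1].block_height


proof = CheckpointProof(
    state_merkle_root="root-at-297",
    qcs=[
        QuorumCertificate(299, "b299", "b298"),
        QuorumCertificate(298, "b298", "b297"),
        QuorumCertificate(297, "b297", "b296"),
    ],
)
print(committed_height(proof))
```

For the example above this yields 297. A proof whose QCs skip a block or point at a different parent is rejected, and a joining validator would then compare state_merkle_root against the root of the state it downloaded.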