ChainSafe / forest

🌲 Rust Filecoin Node Implementation
https://forest.chainsafe.io
Apache License 2.0

Implement Syncing #115

Closed dutterbutter closed 4 years ago

dutterbutter commented 4 years ago

The primary purpose of chain sync is to handle the retrieval and propagation of blocks and messages, and state replication. I will open separate accompanying issues to reference relevant data structures (e.g. chain manager (#85) and block propagation).

Relevant chainsync files in existing implementations:

Go-Filecoin:

Lotus:

The chain sync protocol can be broken down into 5 steps according to the specification:

  1. Validate local data structures
  2. Bootstrap to the network
  3. Sync to specified checkpoint
  4. Update statetree, best targets, db during chain catchup
  5. Stay in sync and participate

[ 1 ] Involves loading local state, chain data structures (tipsets), and caches. These data structures need to pass local semantic and syntactic validation. Additionally, the state tree and chain structures are checked against the LatestCheckpoint.
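
A minimal sketch of what loading and checking local structures could look like; the `Cid`, `Tipset`, and `Checkpoint` types and the `validate_local_structures` helper are hypothetical stand-ins, not Forest's actual API:

// Hypothetical types standing in for the real chain structures.
struct Cid(String);
struct Tipset { key: Cid, epoch: u64 }
struct Checkpoint { tipset_key: Cid, state_root: Cid }

// Load local chain data and verify it reaches the LatestCheckpoint.
fn validate_local_structures(
    tipsets: &[Tipset],
    state_roots: &[Cid],
    checkpoint: &Checkpoint,
) -> Result<(), String> {
    // 1. Syntactic/semantic validation of each tipset (placeholder check).
    for ts in tipsets {
        if ts.key.0.is_empty() {
            return Err(format!("malformed tipset at epoch {}", ts.epoch));
        }
    }
    // 2. Check that the chain and state tree reach the LatestCheckpoint.
    let has_tipset = tipsets.iter().any(|ts| ts.key.0 == checkpoint.tipset_key.0);
    let has_state = state_roots.iter().any(|c| c.0 == checkpoint.state_root.0);
    if has_tipset && has_state {
        Ok(())
    } else {
        Err("local data does not reach the LatestCheckpoint".into())
    }
}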

[ 2 ] Connect to the Filecoin network by establishing connections with existing peers. It is recommended to use multiple discovery protocols (e.g. BootstrapList, Kademlia DHT, or Gossipsub). Begin accepting/reviewing best tipset heads from the network via pubsub. The specification also indicates spinning up a Graphsync service that responds to other nodes' queries; however, we can circumvent that until Graphsync is completed.
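
As a rough sketch, the discovery sources could be combined into one candidate peer list; the `PeerSource` enum and `bootstrap_peers` helper below are purely illustrative assumptions, not Forest's or libp2p's real API:

// Illustrative peer-discovery sources; real discovery would go through libp2p.
enum PeerSource {
    // Static multiaddrs shipped with the node config.
    BootstrapList(Vec<String>),
    // Peers learned via the Kademlia DHT.
    KademliaDht,
    // Peers seen publishing tipset heads over Gossipsub.
    Gossipsub,
}

// Collect candidate peer addresses from every configured source.
fn bootstrap_peers(sources: &[PeerSource]) -> Vec<String> {
    let mut peers = Vec::new();
    for source in sources {
        match source {
            PeerSource::BootstrapList(addrs) => peers.extend(addrs.clone()),
            // DHT/Gossipsub discovery happens asynchronously; nothing to add eagerly.
            PeerSource::KademliaDht | PeerSource::Gossipsub => {}
        }
    }
    peers
}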

Check whether the blocks and statetree loaded in the previous steps (e.g. corresponding to the LatestCheckpoint) match; based on the result, either sync to that checkpoint (step 3) or start updating the relevant data (e.g. statetree, best targets, db) through chain catchup (step 4).
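
Expressed as code, that decision might look like the sketch below; the `SyncStage` enum and the plain string comparison of roots are illustrative assumptions only:

// Which stage the syncer should enter next (illustrative).
enum SyncStage {
    // Local data does not reach the checkpoint: fetch it (step 3).
    SyncCheckpoint,
    // Local data matches the checkpoint: catch up to the heads (step 4).
    ChainCatchup,
}

// Compare the locally loaded state root against the LatestCheckpoint's.
fn next_stage(local_state_root: &str, checkpoint_state_root: &str) -> SyncStage {
    if local_state_root == checkpoint_state_root {
        SyncStage::ChainCatchup
    } else {
        SyncStage::SyncCheckpoint
    }
}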

[ 3 ] A node begins syncing when its blocks and statetree do not correspond to the network's LatestCheckpoint. When this occurs, we will send requests via Graphsync to randomly chosen peers for the correct blocks and statetree referenced by the LatestCheckpoint. It is likely that individual requests will fail, so multiple attempts are required. Further, we will need to fetch the parents of the specified blocks.

pub fn sync_checkpoint() {
    // while state_root != latest_checkpoint.state_root {
    //     use Graphsync to pull IpldStore data from random peers
    //     check the local store first to ensure the data is not already stored
    // }

    // sync complete, move to chain followup
}
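
Fleshing that out slightly, the fetch-with-retries part could look like the following sketch; `fetch_via_graphsync`, the retry budget, and the peer selection are assumptions for illustration, not Forest's actual Graphsync interface:

use std::collections::HashSet;

const MAX_ATTEMPTS: usize = 5; // assumed retry budget per object

// Hypothetical Graphsync fetch: returns raw block bytes on success.
fn fetch_via_graphsync(peer: &str, cid: &str) -> Option<Vec<u8>> {
    let _ = (peer, cid); // a real node would issue a Graphsync request here
    None
}

// Try several peers until the object referenced by `cid` is retrieved,
// skipping the network entirely if the local store already holds it.
fn fetch_with_retries(
    peers: &[String],
    local_store: &HashSet<String>,
    cid: &str,
) -> Option<Vec<u8>> {
    if local_store.contains(cid) {
        return Some(Vec::new()); // already stored locally, nothing to fetch
    }
    if peers.is_empty() {
        return None;
    }
    for attempt in 0..MAX_ATTEMPTS {
        // Naive round-robin stand-in for random peer selection.
        let peer = &peers[attempt % peers.len()];
        if let Some(bytes) = fetch_via_graphsync(peer, cid) {
            return Some(bytes);
        }
    }
    None // caller can retry later or fetch parents from other peers
}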

[ 4 ] At this point, the processes from the previous steps (local validation, network bootstrap, and checkpoint sync) are in place.

The important part of this stage is that all fetched blocks must be stored, validated, and linked locally (e.g. headers, messages). Throughout this process, bad tipset heads are removed from the data storage structures.

Finally, once all blocks between finality_tipset and target_heads have been validated, we are good to move forward to Chain Followup.

Validation of blocks may look like the following (a rough sketch follows the list):

  1. Syntax (serialization, value ranges)
  2. Consensus rules (e.g. weight, epoch values)
  3. Block signatures
  4. ElectionPoSt is the correct winning ticket
  5. Chain ancestry lines up correctly
  6. Message signatures
  7. State tree - tipset messages execution produces the claimed roots for messages/receipts
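
A sketch of running those checks in order, short-circuiting on the first failure; the `Block` fields and error strings are placeholders rather than Forest's real validation API (the syntax and ElectionPoSt checks are elided):

// Placeholder block type; a real header carries far more fields.
struct Block {
    weight_ok: bool,
    signature_ok: bool,
    parents_known: bool,
    messages_ok: bool,
    state_root_matches: bool,
}

// Apply the checks from the list above in order, returning the first failure.
fn validate_block(block: &Block) -> Result<(), &'static str> {
    if !block.weight_ok {
        return Err("consensus rules: weight/epoch check failed");
    }
    if !block.signature_ok {
        return Err("block signature invalid");
    }
    if !block.parents_known {
        return Err("chain ancestry does not line up");
    }
    if !block.messages_ok {
        return Err("message signature invalid");
    }
    if !block.state_root_matches {
        return Err("tipset message execution did not produce the claimed roots");
    }
    Ok(())
}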

[ 5 ] All previous stages are still intact and running; here we need to keep track of block_gap, the number of blocks that must still be validated to reach a safe best_target_head, and epoch_gap, the corresponding number of epochs.

In this stage, we need to set a threshold that tells the node to revert to a previous step (e.g. Chain Catchup, step 4) if it falls behind. We can do so by setting max values for both the block and epoch gaps.

if block_gap > max_block_gap || epoch_gap > max_epoch_gap {
    // fall back to chain catchup (step 4)
}
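
As a compilable version of that check, the gaps could be computed against the best target head; the threshold constants and parameter names below are assumptions, not tuned values:

// Assumed thresholds; real values would come from configuration/tuning.
const MAX_BLOCK_GAP: u64 = 10;
const MAX_EPOCH_GAP: u64 = 5;

// Decide whether to keep following the chain or drop back to chain catchup,
// given the node's current position and the best target head's.
fn should_revert_to_catchup(
    current_epoch: u64,
    current_height: u64,
    target_epoch: u64,
    target_height: u64,
) -> bool {
    let epoch_gap = target_epoch.saturating_sub(current_epoch);
    let block_gap = target_height.saturating_sub(current_height);
    block_gap > MAX_BLOCK_GAP || epoch_gap > MAX_EPOCH_GAP
}
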
ec2 commented 4 years ago

i love the way you write

amerameen commented 4 years ago

Closing this as syncing is done!