Add experimental `--Xsnapsync-bft-enabled` which enables snap sync for BFT chains

matthew1001 commented 1 month ago

PR description

This PR adds various changes/fixes that allow snap-sync and BFT to be used together.

Since this is a relatively new combination of configuration I have added an experimental flag --Xsnapsync-bft-enabled which must be set in order for Besu to start.

My general opinion for enterprise chains is that it is probably reasonable for a BFT chain to have sync-min-peers set to 1. The trust requirements of a public chain are not as applicable to an enterprise chain, so the need to have e.g. 5 peers before you are willing to consider their pivot-block data is too restrictive. At the very least it makes it impossible to have a chain of 4 validators, or a chain of 6 validators with one unavailable. I've broadly outlined below the snap sync logic that this PR uses, some of which assumes that an enterprise user will set --sync-min-peers=1 if their chain only has a small number of nodes. I think there's an argument for the --profile=enterprise defaults to set --sync-min-peers=1, but I'm going to save that for a separate PR.

The basic logic used for sync is as follows:

If a node has a best peer (i.e. they have found a node with pivot data) but the peer's current block height < --Xsynchronizer-fast-sync-pivot-distance (50 by default) it skips any attempt at snap sync. Quitting the snap sync process allows the node to start contributing to BFT voting which ensures the chain isn't stalled waiting for this node. (The state for the most recent 50 blocks is obtained using full sync, which is existing behaviour)
If a node enters snap sync but believes itself to be the only validator it exits snap sync. As above, this ensures that the node can then start its BFT mining coordinator and continue producing new blocks, rather than get stuck in a loop of failed snap sync attempts which it will probably never get out of. This is particularly useful for the case where a single BFT validator node is started as a new chain, to which other validators will be added later on. Without this logic it would wait for at least 1 other peer, which may never come along.
If a node has sync-min-peers peers who are validators, but none of them has any usable pivot data, it quits the snap sync process, again to allow its BFT mining coordinator to start producing new blocks. This is necessary in a case where a number of new validators have been created for a brand new chain, but because they are all new and have no state to share they would get stuck in a loop of failed snap sync attempts.

(It's also worth noting that I've continued the process of renaming the fast-sync internal variables to be more generic as they also apply to snap and checkpoint sync)

(Note: see related PR https://github.com/hyperledger/besu/pull/7204 which fixes a follow on issue)

Documentation

The broad topic of sync-ing is covered in the public section of the docs here: https://besu.hyperledger.org/development/public-networks/get-started/connect/sync-node

I think the private section of the docs would do with its own sync-ing topic, which refers to the article above for the basic "how do the different sync modes work" but explains the permissioned-chain specific behaviours that I've implemented in this PR. I think the explanation in the description above is probably a reasonable starting point to write those docs, but I'm happy to review and edit the new topic once an initial draft is there.

non-fungible-nelson commented 1 month ago

Please test these changes with a Mainnet or Holesky / sepolia node to look for unintended side-effects as called out by Karim. Or insane logging or strange behavior. A Holesky snap sync should complete in a few hours and be considered an acceptance test for this PR.

matthew1001 commented 1 month ago

Getting back into this PR now. I have some refactoring planned but pleased to see that the general direction seems to be OK.

@matkt I think there are benefits of snap sync to permissioned chains in the same way as public chains. Adding a new node (validator or otherwise) to a permissioned chain will take a long time to sync if the chain has been running for a while.

I'll look at your comments on the log changes I made. I found it clearer to see what was going on and to match that to the code if the log entries were more specific to the class they were coming from.

non-fungible-nelson commented 1 month ago

Completed a Holesky snap-sync with no unintended consequences (with the flag set). Some logging noise for cleanup.

matthew1001 commented 1 month ago

Firstly many thanks to @matkt for the help getting the snap-server issues fixed when there is an empty chain with 0 accounts, and for all the guidance on the codebase in that snap-sync area. Really enjoyed the collaboration on it.

@garyschulte I think the PR is ready for re-review now that we've got to the bottom of issues with 0 account chains among other things.

@non-fungible-nelson can you confirm if your sync with holesky was successful and that the PR hasn't regressed anything in that area?

matthew1001 commented 3 weeks ago

Completed a Holesky snap-sync with no unintended consequences (with the flag set). Some logging noise for cleanup.

I've reverted the info logs to debug logs in places that @matkt commented on, given that you said there was some logging noise.

non-fungible-nelson commented 3 weeks ago

Completed another holesky sync as of commit 31c0cd3 and it went great.

matthew1001 commented 3 weeks ago

@garyschulte @matkt think I've tidied up remaining loose ends:

[x] Re-tested a FAST sync to check that nothing has regressed there and it looks fine
[x] Public sync is looking good as per Matt's comment ^^
[x] I've added some additional tests for the BFTPivotSelectorFromPeers class.

garyschulte commented 3 weeks ago

I won't die on the --fast-sync-min-peers hill, but I would like to see the trie-log-pruning option change move to a separate PR.

Would mark as :ship: with that minor change.

matthew1001 commented 3 weeks ago

Yeah that's reasonable. I'll refactor the trie-log-pruning change into its own PR

matkt commented 3 weeks ago

Completed another holesky sync as of commit 31c0cd3 and it went great.

I think it will be better to test also a mainnet node just to be sure ( we have more pivot block swtich, more interesting usecases, etc). I don't think the modification will add regression but I will prefer to test (checkpoint is enough)

matthew1001 commented 3 weeks ago

I won't die on the --fast-sync-min-peers hill, but I would like to see the trie-log-pruning option change move to a separate PR.

Would mark as 🚢 with that minor change.

The latest commit removes the trie-log change and I'll raise a separate PR for that shortly where any additional discussion can take place

hyperledger / besu

Add experimental `--Xsnapsync-bft-enabled` which enables snap sync for BFT chains #7140

PR description

Documentation