paritytech / substrate

Substrate: The platform for blockchain innovators
Apache License 2.0
8.38k stars 2.65k forks source link

GRANDPA: extend neighbor packet to ensure vote-blocks are picked up by sync #3629

Open rphmeier opened 4 years ago

rphmeier commented 4 years ago

In GRANDPA, voters vote on the chains that they perceive to be the best. In order to process a vote, you have to at least be aware of the header it was cast on and its ancestry. In Substrate, the best way to do that now is to import the header. Thus, to complete a GRANDPA round where voters are fragmented across different forks, we have to ensure that we can eventually sync all of those forks.

The sync service makes a best-effort to download all forks right now, but can't do it perfectly because it doesn't have the following two pieces of information:

GRANDPA peers currently broadcast a neighbor packet to each other containing information about the protocol state. Most importantly round_number and set_id.

This issue proposes extending that neighbor packet with a field known_vote_blocks: Vec<Block::Hash>. This should contain the block hashes that votes (Prevote and Precommit) in the advertised round of the neighbor packet have referenced, which the node has already synced. Essentially, this tells peers which blocks they have which might be necessary in order to make progress.

This list should be bounded by 2*num_voters, but maybe 2*num_voters + 1 (if we include the primary-proposal message, although that's not necessary to make progress).

We should feed all of these block hashes, along with the PeerId of any peers which advertise them, into the sync service. The sync service should attempt to download these chains from the given peer set until told that those block hashes aren't needed anymore (i.e. when we progress onto the next round, we don't need to import those votes anymore).

rphmeier commented 4 years ago

The issue could reasonably be done in two parts:

  1. Adjust the sync service so that it can be informed about forks to sync and who to sync from.
  2. Adjust GRANDPA so it informs the sync service
arkpar commented 4 years ago

I'll make a PR for the sync service API

arkpar commented 4 years ago

3633

rphmeier commented 4 years ago

A couple more things to address:

Also in the GRANDPA side of things, we should not use the sync::sync_fork API unless we have already detected a stall for some time. And when moving on beyond a stuck round, we should drop all the requested forks.

mxinden commented 4 years ago

Thanks for doing part one @arkpar. To ensure we are not duplicating work: I am looking into part two right now.

mxinden commented 4 years ago

Summary of discussions with André and Rob:

Instead of piggybacking on the neighbor messages, we can use the prevote and precommit messages for now to advice the network sync service which blocks to fetch. Receiving a prevote or precommit messages implies that the sender has the given block.

This approach simplifies the implementation without blocking possible future optimizations via the neighbor message. It simplifies the implementation by not requiring additional logic to send out the hashes that we have locally. Thus not requiring to pass down the blockchain client to the gossip validator.

rphmeier commented 4 years ago

Yes - although we will want the neighbor packet change as well as a follow-up.