paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.com/

BEEFY & GRANDPA: consider using a grid topology for gossip #3896

Open sandreim opened 7 months ago

sandreim commented 7 months ago

We already have a battle-tested grid topology implementation that we use in parachain consensus for approval voting and statement distribution:

https://github.com/paritytech/polkadot-sdk/blob/5638d1a830dc70f56e5fdd7eded21a4f592d382c/polkadot/node/network/protocol/src/grid_topology.rs

Using it should reduce the gossip load while still providing redundancy and protection against DoS, though it might require some fine-tuning.
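For readers unfamiliar with the idea, here is a minimal sketch of a 2D grid layout. It assumes peers are numbered `0..n` and placed row-major in a roughly square grid, with each peer gossiping to everyone in its row and its column. The function name `grid_neighbors` and the layout details are illustrative assumptions, not the code in the linked `grid_topology.rs`.

```rust
/// Row and column neighbours of `index` when `n` peers are laid out
/// row-major in a roughly square grid. Illustrative only.
pub fn grid_neighbors(index: usize, n: usize) -> (Vec<usize>, Vec<usize>) {
    assert!(index < n, "index out of range");
    // Smallest width whose square covers all n peers.
    let width = (1..=n).find(|w| w * w >= n).unwrap_or(1);
    let (row, col) = (index / width, index % width);

    // Same row: indices row*width .. (row+1)*width, clipped to n.
    let row_neighbors = (row * width..((row + 1) * width).min(n))
        .filter(|&i| i != index)
        .collect();
    // Same column: col, col + width, col + 2*width, ...
    let col_neighbors = (col..n)
        .step_by(width)
        .filter(|&i| i != index)
        .collect();
    (row_neighbors, col_neighbors)
}
```

In a layout like this, a message sent along the originator's row and then forwarded along each receiver's column reaches a peer in roughly two hops, while each peer keeps only about `2 * sqrt(n)` gossip links instead of `n`.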

acatangiu commented 7 months ago

Unfortunately, BEEFY, like GRANDPA, can't use any gossip distribution strategy that requires knowing the full peer set. The reason is that not only validators participate in this gossip but all nodes, even light clients, so the set of peers is in continuous flux.

But you do raise a good point that there is definitely room for improvement.

Ideas:

Feedback/ideas are welcome.

cc @andresilva

burdges commented 7 months ago

That's probably not really true for GRANDPA. The security assumptions rest with the validators, so involving others maybe makes things worse.

Our grid topology is probably not optimal anyway. It's a 2D grid plus a super-spammy layer, but the two do not unify, which likely messes things up. We think a better topology would be a 3D grid unified with some random sends, but far fewer than the current random sends. This would increase diameter/hops, but add considerable robustness and have many fewer edges than the current topology.

We do not, however, know yet that this topology is better. We also do not know how often the topology should be rerandomized, or what's optimal for set reconciliation in the gossip. It isn't necessarily worth changing anything until we have more concrete arguments (or discover issues with the current one).
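To make the edges-versus-hops trade-off concrete, here is a rough sketch. It assumes a full grid in which each node links to every node that differs from it in exactly one coordinate, and it ignores the random sends entirely. The function `grid_cost` and the node count of 1000 are hypothetical and used only for illustration.

```rust
/// Per-node neighbour count and worst-case hop count for `n` nodes placed
/// in a `dims`-dimensional grid. Assumes a full grid; random sends are
/// not modelled.
pub fn grid_cost(n: u64, dims: u32) -> (u64, u32) {
    assert!(n > 0 && dims > 0);
    // Smallest side length whose dims-th power covers all n nodes.
    // s = n always satisfies the predicate, so `find` cannot fail.
    let side = (1..=n)
        .find(|s| s.saturating_pow(dims) >= n)
        .unwrap();
    // (side - 1) neighbours along each of the `dims` axes; one hop per axis.
    (u64::from(dims) * (side - 1), dims)
}
```

Under these assumptions, 1000 nodes need 62 links per node in 2D (side 32, two hops) but only 27 in 3D (side 10, three hops). That matches the shape of the claim above: fewer edges at the cost of a larger diameter.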

As for this issue, BEEFY & GRANDPA must export their results, so any topology they use must feed others, even if the validators do their own thing. This definitely requires doing something beyond what approval checking does.

rphmeier commented 7 months ago

> hybrid system, like grid topology for validators and full mesh for everyone else,

Yeah, we've toyed with this idea in the past; there's just not been any impetus for it. You could imagine changing the grid topology logic for each validator node to accept a number of arbitrary "subscriber" peers to whom they send data. Those subscribers can have their own subscribers, and so on. This would be useful for the parachains logic too, as user nodes could inspect the flow of messages across the network.
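A minimal sketch of the subscriber idea, assuming each node caps how many subscribers it accepts and forwards every message to its grid neighbours plus those subscribers. The `Relay` type, the `PeerId` alias, and the cap are all hypothetical; nothing like this exists in the codebase as far as this thread says.

```rust
use std::collections::BTreeSet;

/// Hypothetical peer identifier; a real node would use its network peer ID.
pub type PeerId = u32;

/// A node that forwards gossip to its grid neighbours (empty for
/// non-validators) plus a bounded set of subscribers.
pub struct Relay {
    grid_neighbors: BTreeSet<PeerId>,
    subscribers: BTreeSet<PeerId>,
    max_subscribers: usize,
}

impl Relay {
    pub fn new(grid_neighbors: impl IntoIterator<Item = PeerId>, max_subscribers: usize) -> Self {
        Relay {
            grid_neighbors: grid_neighbors.into_iter().collect(),
            subscribers: BTreeSet::new(),
            max_subscribers,
        }
    }

    /// Accepts `peer` as a subscriber unless the cap is reached.
    /// Returns true if the peer is subscribed after the call.
    pub fn subscribe(&mut self, peer: PeerId) -> bool {
        if self.subscribers.contains(&peer) {
            return true;
        }
        if self.subscribers.len() >= self.max_subscribers {
            return false;
        }
        self.subscribers.insert(peer)
    }

    /// Peers to forward a message to, excluding the peer it came from.
    /// Returned in ascending order.
    pub fn recipients(&self, from: Option<PeerId>) -> Vec<PeerId> {
        self.grid_neighbors
            .union(&self.subscribers)
            .copied()
            .filter(|p| Some(*p) != from)
            .collect()
    }
}
```

A non-validator would run the same logic with an empty neighbour set, so subscribers of subscribers form a tree hanging off the validator grid. The cap is what keeps any single validator's outbound load bounded.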

acatangiu commented 2 months ago

Also related: https://github.com/paritytech/polkadot-sdk/issues/1123