paritytech / grandpa-bridge-gadget

A Bridge Gadget to Grandpa Finality.
GNU General Public License v3.0
100 stars 38 forks source link

Parachain headers MUST NOT contain validator set changes in the digest #11

Open tomusdrw opened 3 years ago

tomusdrw commented 3 years ago

Currently Substrate spits out the full list of babe validators in the header digest. For parachains and in the bridges context it will get prohibitively expensive to proof anything about such headers, so we should make sure that cumulus does not do that.

CC @andresilva @bkchr

bkchr commented 3 years ago

Why will it be expensive to proof for parachains?

tomusdrw commented 3 years ago

Assuming 1024 validators * 32 bytes = 32KB of just validator set data directly inside the header. To verify anything about a parachain (i.e. storage proof, transaction inclusion proof, etc) you will need to provide the full header content to prove it's actually the one. We could do some pre-processing on the relay chain side to remove this data before inserting to Merkle Tree (which root ends up in the MMR), but this would change the header hash for instance.

bkchr commented 3 years ago

But don't we require these validator set changes in the digest for light clients?

And I don't understand, what ends in the MMR? The hash of the parachain header or the hash of the relay chain header?

tomusdrw commented 3 years ago

MMR will most likely contain a tuple of 3 elements:

  1. Option<MerkleRootOfBeefyPublicKeys> - all BEEFY public keys for upcoming epoch, merkelized. Only present for set transition blocks.
  2. MerkleRootOfParachainHeads - a merkle root hash of either a latest header for every registered parachain or for every parachain that made some progress in current relay chain block (depends on how expensive rebuilding the entire tree is going to be)
  3. ParentBlockHash - a parent hash of the current relay chain block

So if you want to prove to a BEEFY light client that something happened on a parachain you'll need:

  1. ParachainSpecificProof - either a storage proof, a transaction inclusion proof, or a bridge-specific datastructure proof.
  2. ParachainHeader - a full version of parachain header
  3. ParachainInclusionProof - a proof that the parachain head is part of MerkleRootOfParachainHeads for some relay chain block X
  4. MMRProof - MMR proof for block X, containing the full leaf (specifically MerkleRootOfParachainHeads is important to be proven).

We can obviously require a bunch of more bridge-specific changes from parachains - for instance the MMR could only contain some output of bridge-specific data structure living on the Parachain, so that the full head data is not necessary in the proof, but it has quite significant consequences in generality.

But don't we require these validator set changes in the digest for light clients?

That's only partially true, we need some information that will allow Light Clients to verify the new validator set. Returning the entire set is the simplest and most straightforward way, but it would be totally fine if the digest only contained a merkle root of the keys and the full tree would be provided externally.

AFAIU most of the external apps requiring stuff from parachain will rather depend on GRANDPA finality proofs of the relay chain instead of tracking block production on a parachain itself, so for parachain heads these details are most likely redundant.

seunlanlege commented 2 years ago

it would be totally fine if the digest only contained a merkle root of the keys and the full tree would be provided externally.

Right so this would mean that authorities would have to be signaled a diff way, inherents perhaps?

tomusdrw commented 2 years ago

A signal is already there - that's the digest item (containing a root hash). But then when you have the signal you'd need to figure out the validator set from some other place (for instance by querying the state). Inherent could be used for that, but it feels complicated (i.e. the block author would need to add the inherent in case they predict that the rutnime would generate the runtime item in that block. Every block with the digest item and without inherent would need to be considered invalid).

seunlanlege commented 2 years ago

A signal is already there - that's the digest item (containing a root hash)

Yeah i get that, but we need some other way to produce the full list of authorities.

for instance by querying the state

yeah this makes sense, so i guess only changes needed is to change BABE/AURA/Grandpa from depositing full authority set in digests to just depositing merkle roots of this data. There's already storage items that track the next validators, we'd also have to modify the babe/aura/grandpa worker if they rely on that information in the headers.

How would the network handle changes to the header like this? Just out of curiosity

tomusdrw commented 2 years ago

The main challenge with this issue is to make sure that Polkadot and Cumulus are able to handle that, cause AFAIU currently we rely on the full validator set to be in the header digest item.