ethereum / consensus-specs

Ethereum Proof-of-Stake Consensus Specifications

Light client proposal #459

Closed · vbuterin closed this issue 5 years ago

vbuterin commented 5 years ago

See for background: https://github.com/ethereum/eth2.0-specs/issues/403

Protocol changes

Version 1b: instead of merkle_tree, use tree_hash_root. The tree hash root has hashed-in metadata about the length of the list, so latest_active_validator_set_lengths becomes superfluous and can be removed.
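For illustration, this is roughly how a list's length ends up authenticated by its tree hash root (a simplified sketch in the style of what later became SSZ's length mix-in, not a verbatim spec excerpt):

```python
from hashlib import sha256


def mix_in_length(contents_root: bytes, length: int) -> bytes:
    # Hashing the list length together with the Merkle root of its contents
    # means the root alone authenticates the length, so a separate
    # latest_active_validator_set_lengths field is unnecessary.
    return sha256(contents_root + length.to_bytes(32, "little")).digest()
```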

Light client procedure

Suppose that a light client has already authenticated the chain up to some KNOWN_BLOCK, which it considers final. Per the requirements of persistent committees, this means that the light client has the information it needs to predict persistent committees up to ~9 days into the future (see https://github.com/ethereum/eth2.0-specs/issues/375). The light client will ask for a LightClientSkipProof object for a future block up to ~9 days in the future, which contains the following objects:

The light client privately chooses a shard in [0...SHARD_COUNT-1]. The light client will use the randao mixes, active index roots and active validator set lengths that it can derive from the state root of KNOWN_BLOCK to derive enough information to compute:

Number-theoretic shuffles would let us compute all of this quickly.
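To illustrate why this matters, here is a toy number-theoretic shuffle (not the one actually specified); the parameters a and b would in practice be derived from the randao mix, and computing one validator's position requires no pass over the full list:

```python
import math


def shuffled_position(index: int, list_size: int, a: int, b: int) -> int:
    # Toy affine permutation i -> (a*i + b) mod n; it is a bijection whenever
    # gcd(a, n) == 1, so a single validator's shuffled position can be
    # computed directly, without materializing the whole permutation.
    assert math.gcd(a, list_size) == 1
    return (a * index + b) % list_size
```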

For each i in range(128), it will check:

It will also check the validity of the proofs of possession. Finally, it will compute a group_pubkey from the subset of the validators specified in the participation_bitfield plus each pubkey in the proofs of possession, and check BLSVerify(pubkey=group_pubkey, message=hash_tree_root(header), signature=aggregate_signature, domain=PERSISTENT_COMMITTEE_DOMAIN). It also checks that the 1 bits in the participation_bitfield correspond to at least 2/3 of the total balance of the 128 selected validators.
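A minimal sketch of these final checks (the helpers bls_aggregate_pubkeys, bls_verify, hash_tree_root and the bitfield layout are assumptions for illustration, not taken from the spec):

```python
def verify_skip_proof(header, committee, participation_bitfield,
                      extra_pubkeys, aggregate_signature):
    # committee: list of (pubkey, balance) pairs for the 128 selected
    # validators, authenticated earlier via Merkle branches against
    # KNOWN_BLOCK's state root.
    participants = [v for i, v in enumerate(committee)
                    if participation_bitfield[i // 8] >> (i % 8) & 1]
    group_pubkey = bls_aggregate_pubkeys(
        [pubkey for pubkey, _ in participants] + extra_pubkeys)
    assert bls_verify(
        pubkey=group_pubkey,
        message=hash_tree_root(header),
        signature=aggregate_signature,
        domain=PERSISTENT_COMMITTEE_DOMAIN,
    )
    # The participating validators must control at least 2/3 of the
    # committee's total balance.
    participating = sum(balance for _, balance in participants)
    total = sum(balance for _, balance in committee)
    assert participating * 3 >= total * 2
    return True
```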

If all checks pass, header becomes the new KNOWN_BLOCK.

Explanation

The idea is simple: use the client's existing knowledge of KNOWN_BLOCK to compute the committee that will be valid in some block up to 9 days in the future, then ask the network to provide Merkle branches proving the indices, and then the pubkeys and the balances of that committee. This gives it the information it needs to compute and verify an aggregate signature.

Light clients can skip ahead as many blocks as they please, from one block at a time up to the maximum of ~9 days at a time. Note that if a light client is skipping less than 9 days at a time, it may not need new Merkle branches, as the committee is the same as before; the branches can then be omitted, and only a header, bitfield and aggregate signature need be submitted.

Rationale

Complexity analysis

In total, roughly 250 to 900 kB per ~9 days. Compare Bitcoin block headers: 80 bytes per 10 minutes ~= 100 kB per 9 days, or eth1 block headers: ~500 bytes per 15 seconds ~= 26 MB per 9 days.
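The comparison figures can be reproduced with a quick back-of-the-envelope calculation:

```python
# Rough header overhead over a ~9 day window.
days = 9
bitcoin = 80 * (days * 24 * 60 // 10)        # 80-byte headers every 10 minutes
eth1 = 500 * (days * 24 * 60 * 60 // 15)     # ~500-byte headers every 15 seconds
print(bitcoin)  # 103680 bytes  ~= 100 kB
print(eth1)     # 25920000 bytes ~= 26 MB
```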

Note also that proofs can be compressed via a SNARK or STARK.

djrtwo commented 5 years ago

closed via #476

djrtwo commented 5 years ago

Reviving this until we port this proposal and explanation somewhere else

zah commented 5 years ago

The above proposal produces a highly predictable amount of traffic, which is a nice property, but one advantage of the previous approach (based on older ideas of Vitalik presented here) was that the required traffic adapts to the rate of change in the active validator set. If the rate of change is low, the skips can be much longer than 9 days.

In the last few days, I've been looking for new ways to improve the previous scheme and here I'll outline a new algorithm that appears to be very efficient.

Before we start, let's make some observations:

1. If you know the active validator set, you don't need to sync past blocks at all.

A light client is interested only in the latest finalized state. Everything else in the state of the beacon chain or the shards can be authenticated through merkle proofs obtained from a LES-like protocol. Thus, if a light client knows the active validator set, it can just listen for attestations and block proposals coming from known validators. With every "vote" seen on the network, the light client's probabilistic assurance that a certain state is finalized increases. When the network finalizes a new state, the light client will do the same. Furthermore, the client may be immediately presented with a set of signatures from a super-majority of the known validators that directly authenticates the latest finalized state (more on this later). This would be an instant sync procedure.
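A rough sketch of the kind of bookkeeping such a light client could do while listening (the attestation format is simplified and the helper names are assumptions):

```python
def track_votes(attestations, balances, total_balance):
    # attestations: iterable of (validator_index, finalized_state_root) pairs
    # extracted from gossip, already signature-checked against known pubkeys.
    voted = {}  # state_root -> set of validator indices that backed it
    for index, root in attestations:
        voted.setdefault(root, set()).add(index)
    for root, indices in voted.items():
        weight = sum(balances[i] for i in indices)
        if weight * 3 >= total_balance * 2:
            return root  # a super-majority of known validators backs this root
    return None
```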

On the other hand, if the light client doesn't know the active validator set, it cannot make any judgments about the observed traffic. Thus, the problem of syncing with the network after being offline for a while boils down to syncing the changes in the active validator set.

2. If a super-majority of the known validators have signed any datum in the consensus objects, that's enough for the light client to trust it.

This comes from the economic reasoning used in Casper. Whatever the protocol changes are, we can just adapt the slashing conditions to give assurance to the light client that if a super-majority of the validators have signed something, this is either a finalized consensus or a lot of purchasing power is being destroyed.

I'll use this argument to make the claim that if the light client is presented with a plausible history of how the validator set changed, it's enough to authenticate the end result signed by a super-majority of the validators; the history itself may be transmitted or compressed in arbitrary ways. Thus, syncing the validator set changes is greatly simplified if the consensus objects somehow include signatures of all validators authenticating the validator set's contents at a particular time.

Proposed solution

A key technical insight for this new proposal is that the BLS signature aggregation allows us not only to have multiple signers, but also to have multiple signed messages without increasing the signature size (the cost is only increased signing time; the verification time stays the same).
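As an illustration of that property, the sketch below uses py_ecc's IETF-style BLS API (exact names may differ across versions; this is not the scheme itself, just a demonstration that one aggregate signature can cover multiple signers and multiple messages):

```python
from py_ecc.bls import G2ProofOfPossession as bls

sk1, sk2 = 1, 2  # toy secret keys for demonstration only
pk1, pk2 = bls.SkToPk(sk1), bls.SkToPk(sk2)

msg_block = b"block root"
msg_validator_set = b"validator set root"

sig1 = bls.Sign(sk1, msg_block)
sig2 = bls.Sign(sk2, msg_validator_set)

# One aggregate signature covers both signers and both messages;
# its size is the same as a single signature.
agg = bls.Aggregate([sig1, sig2])
assert bls.AggregateVerify([pk1, pk2], [msg_block, msg_validator_set], agg)
```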

This proposal is a high-level description of how the light client sync would work. If there is interest, I can work on the concrete details or I can leave this to the EF research team. In Nimbus, we hope to implement proof-of-concept implementations of all viable light client proposals in the near future in order to compare the practical results of using them.

Suggested changes:

Using the key insight above, change the signatures in all block proposals and all attestations to also include a signature over the merkle root hash of the validator set in the latest finalized state.

The goal here is to accumulate signatures authenticating the contents of the validator set. If the light client knows a certain finalized validator set VS_prev, it can be presented with a sync proof that the set changed to VS_next. The proof includes the following information (a structural sketch follows the list):

  1. A minimal list of exits (list of indices).

  2. A minimal list of new deposits (list of <pub key, index> pairs).

  3. A BLS signature that includes at least a super-majority of known validators at VS_prev, signing VS_next.

  4. A bitmask (or a list of signature counts per validator if aggregating just one signature per validator proves to be difficult). Please note that this list will also refer to new validators specified in (2).

  5. A minimal list of transient validators whose signatures got aggregated in (3). A transient validator is one who made a deposit and an exit between VS_prev and VS_next. This list is optional and it will be required only if it proves hard to produce an aggregated signature from the minimal set of required validators (more on this below).

  6. A list of other data that ended up in the BLS signature (this should encode a tree effectively. @AlexeyAkhunov's encoding scheme comes to mind). The client doesn't need to authenticate this data, but it's needed in order to verify the aggregated signature. There is a trade-off here - we can optimize the traffic needed for light clients by storing the validator set signatures separately in the attestations or block proposals, which will increase the network traffic a bit for everyone else.
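Putting the pieces together, such a sync proof might be represented roughly as follows (field names and types are illustrative, not from any spec):

```python
from dataclasses import dataclass
from typing import List, Tuple

BLSPubkey = bytes      # 48-byte compressed pubkey
BLSSignature = bytes   # 96-byte signature
Root = bytes           # 32-byte hash root


@dataclass
class ValidatorSetSyncProof:
    exits: List[int]                        # (1) indices that exited between VS_prev and VS_next
    deposits: List[Tuple[BLSPubkey, int]]   # (2) new <pub key, index> pairs
    aggregate_signature: BLSSignature       # (3) super-majority of VS_prev signing VS_next
    participation: bytes                    # (4) bitmask over VS_prev plus the validators in (2)
    transient_validators: List[BLSPubkey]   # (5) optional: deposited and exited in between
    extra_signed_data: List[Root]           # (6) other data folded into the aggregate signature
```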

Please note that the distance between VS_prev and VS_next can be quite significant. Such a proof needs to be generated roughly when 1/3 of the validators from VS_prev are no longer active. Furthermore, the same proof can be presented to multiple clients as long as their known validator sets are roughly the same. This suggests a design where the light client servers maintain in persistent storage a set of proofs generated at particular times (say at T1, T2, T3, T4, ...) and, if a light client is at TC (T1 < TC < T2), the server just has to serve the precomputed proofs at T2, T3, T4 and so on.
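A server holding precomputed proofs at times T1 < T2 < ... could then answer a client at TC by returning every stored proof generated after TC, for example (illustrative only):

```python
import bisect


def proofs_for_client(client_time, proof_times, proofs):
    # proof_times is the sorted list [T1, T2, T3, ...] at which proofs were
    # precomputed. Skip proofs at or before the client's known time and
    # return the rest in order (T2, T3, T4, ... when T1 < TC < T2).
    start = bisect.bisect_right(proof_times, client_time)
    return proofs[start:]
```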

Once the client has synced the validator set, it can use the methods described in observation (1) to arrive at a new finalized state. To get the "instant sync" scheme working, we can decide that the proposals and attestations will include signatures over the root hash of the entire finalized state. This will slightly increase the size of the sync proof above because the validator set will be authenticated with a merkle proof now.

How to generate a small sync proof?

Generating a small sync proof is a relatively tough computational problem. Once a proof is generated, it's easy to verify that it has the desired properties, so operators of light client servers may decide to somehow share the burden of generating it or to outsource the work altogether. Here are some important considerations:

1) The more individual signatures you know, the more freedom you have to aggregate them into more minimal configurations. The proof generator will try to monitor the whole network and keep a history of all observed individual signatures.

2) The generated sync proof at T2 must be such that every single finalized state between T1 and T2 has a validator set for which the signature at T2 represents a super-majority. This suggests a design where the proof generator keeps track of the best possible T_N, T_N+1, T_N+2, ... over time, and these intervals may shift under certain circumstances. Some heuristic will be applied to determine when the solution being found is "good enough".

I imagine that such proof-generating servers will be run by a few operators, and the light-client server operators will just fetch proofs from them.

Complexity analysis

I need to put in more time to produce concrete numbers here, and obviously the practical cost of the approach will depend on the actual rate of change in the validator set, for which we need real-world statistics. The hand-wavy version of the analysis is that the sync proofs will be smaller than the 9-day proofs suggested above, the skip interval should be much longer than 9 days, and most of the time it would be possible to use the "instant sync" procedure (e.g. if the light client connects to the network once every couple of weeks).

djrtwo commented 5 years ago

The core of this protocol was spec'd out in the repo. Do we want to leave this open for rationale/analysis?