ethereum / consensus-specs

Ethereum Proof-of-Stake Consensus Specifications
Creative Commons Zero v1.0 Universal
3.58k stars 984 forks source link

Simplifying empty epoch transitions #1340

Open vbuterin opened 5 years ago

vbuterin commented 5 years ago

See also: https://github.com/ethereum/eth2.0-specs/issues/1339

Suppose that you receive a block that specifies a parent 100 epochs in the past, and want to verify it. Currently, this requires processing through 6400 slot transitions (fortunately trivial) but also 100 epoch transitions, each of which take O(N) effort because they involve per-validator work, particularly updating everyone's balances. Hence, the cost of doing this verification may be prohibitive. The largest part of the epoch transition work, computing the compact committees root, is now being removed, but can we simplify things further?

First, let's enumerate all of the O(N) computations that are happening in an empty epoch transition (defined here for simplicity as an epoch transition in a situation where both the current and previous epochs had no blocks). Here is the process_epoch function: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/core/0_beacon-chain.md#epoch-processing

def process_epoch(state: BeaconState) -> None:
    process_justification_and_finalization(state)
    process_crosslinks(state)
    process_rewards_and_penalties(state)
    process_registry_updates(state)
    process_slashings(state)
    process_final_updates(state)

This is essentially the only complex thing happening in 6400 slots worth of transitions; slot transitions are just a small number of hash tree updates each. So let's go through these functions one by one.

process_justification_and_finalization

The only non-O(1) work in this function is computing matching_target_attestations = get_matching_target_attestations(state, previous_epoch) and computing its total balance. However, in the no-block case we can simply hard-code this to [] (and the total balance to 0), and so it is trivial.

process_crosslinks

Here is the entire function:

def process_crosslinks(state: BeaconState) -> None:
    state.previous_crosslinks = [c for c in state.current_crosslinks]
    for epoch in (get_previous_epoch(state), get_current_epoch(state)):
        for offset in range(get_committee_count(state, epoch)):
            shard = Shard((get_start_shard(state, epoch) + offset) % SHARD_COUNT)
            crosslink_committee = set(get_crosslink_committee(state, epoch, shard))
            winning_crosslink, attesting_indices = get_winning_crosslink_and_attesting_indices(state, epoch, shard)
            if 3 * get_total_balance(state, attesting_indices) >= 2 * get_total_balance(state, crosslink_committee):
                state.current_crosslinks[shard] = winning_crosslink

As written, this requires computing every crosslink committee using get_crosslink_committee. However, the only state transition is state.current_crosslinks[shard] = winning_crosslink, which we know will never happen in an empty epoch, so we can just add special case code to bypass this function entirely in that case.

process_registry_updates

This function processes the activation queue, which has a throughput of |V| / 65536 (ie. max 64) validators per epoch. So it's more significant than the truly O(1) functions, but still minor. It also processes the conditions for activation eligibility, hitting max balance, which cannot happen unless other objects such as deposits and transfers are being included in any case, and the condition for exit eligibility, falling below half balance, which will also happen rarely.

process_slashings

get_total_active_balance is O(N), though fortunately it only needs to be run if any post-slashed validators actually are getting processed for post-slashing penalties. This can only happen once per slashed validator, so should happen rarely.

process_final_updates

Aside from code that is going away soon (computing active index and compact committee roots), and O(1) code, the main thing happening here is updates to effective balance. The hysteresis adjustment mechanism ensures that a validator balance change of at most 0.5 ETH is required to trigger an effective balance update. This is likely to be the bulk of the processing other than the process_rewards_and_penalties function below. Fortunately, the proposal given below will also remove the need to run this code.

process_rewards_and_penalties

This involves computing (i) get_attestation_deltas and (ii) get_crosslink_deltas and using the returned rewards and penalties to adjust every validator's balances. This is the bulk of the computational complexity for processing an epoch transition.

There are four rewards/penalties that could be applied to a validator in get_attestation_deltas; three are covered by this code:

    for attestations in (matching_source_attestations, matching_target_attestations, matching_head_attestations):
        unslashed_attesting_indices = get_unslashed_attesting_indices(state, attestations)
        attesting_balance = get_total_balance(state, unslashed_attesting_indices)
        for index in eligible_validator_indices:
            if index in unslashed_attesting_indices:
                rewards[index] += get_base_reward(state, index) * attesting_balance // total_balance
            else:
                penalties[index] += get_base_reward(state, index)

In an empty epoch, unslashed_attesting_indices == [], so all eligible_validator_indices are penalized, where eligible_validator_indices is defined as:

[
        ValidatorIndex(index) for index, v in enumerate(state.validators)
        if is_active_validator(v, previous_epoch) or (v.slashed and previous_epoch + 1 < v.withdrawable_epoch)
    ]

The fourth kind of reward, "Proposer and inclusion delay micro-rewards", can be skipped as get_unslashed_attesting_indices(state, matching_source_attestations) = []. The fourth kind of penalty, the inactivity leak, applies to all eligible_validator_indices not in matching_target_attesting_indices (also an empty list), so it applies to all eligible_validator_indices.

In get_crosslink_deltas, once again no index is attesting so no one is rewarded, but every index in every committee is penalized. Though this is not immediately obvious from looking, every active validator is in exactly one crosslink committee, so every validator in get_active_validator_indices(state, epoch) gets penalized the same amount (NOTE: there is a slight difference between here and the above penalties, in that slashed validators are eligible for rewards in get_attestation_deltas but not here.

Proposed protocol changes

From the above, we can see that in an empty epoch, computing any committee is never actually necessary (except for the very limited case of computing one or a few shuffled indices to determine the proposer); rather, all that we need to do is penalize every validator, and in fact penalize each validator the same portion of their balance.

We can make this considerably easier as follows. First, we can remove the penalties from get_attestation_deltas and get_crosslink_deltas; instead, we can consider adding a third get_penalties function to compute a single penalty that would apply to all active validators (to cancel out the penalties for those validators that participated in each reward-receiving reaction, the rewards for those actions would be increased).

But instead, we can take a more radical path. We add to the state a variable penalty_denominator, which starts off at some constant PENALTY_BASE (eg. 2*20). Instead of applying a penalty of `balance num / den, we setpenalty_denominator += penalty_denominator * num / den. We also abstracteligible_indices` into a function:

def get_eligible_validator_indices(state: BeaconState, epoch: Epoch) -> Sequence[ValidatorIndex]:
    return [
            ValidatorIndex(index) for index, v in enumerate(state.validators)
            if is_active_validator(v, epoch) or (v.slashed and epoch + 1 < v.withdrawable_epoch)
    ]

Then, we add a function to the start of process_block, which says:

def reset_penalty_denominator(state: BeaconState):
    if state.penalty_denominator > PENALTY_BASE:
        for index in get_eligible_validator_indices(state, get_previous_epoch(state)):
            new_balance = state.balances[index] * PENALTY_BASE // state.penalty_denominator
            decrease_balance(validator, state.balances[index] - new_balance)
        state.penalty_denominator = PENALTY_BASE

We also need code in process_epoch to process penalties for validators when they exit the eligible set:

eligible_now = get_eligible_validator_indices(state, get_previous_epoch(state))
eligible_before = get_eligible_validator_indices(state, max(get_previous_epoch(state), 1) - 1)
for index in range(len(state.validators)):
    if index in eligible_before and index not in eligible_now:
        new_balance = state.balances[index] * PENALTY_BASE // state.penalty_denominator
        decrease_balance(validator, state.balances[index] - new_balance)

This last piece of code is O(N) as written, but with data structures can be made to be <= O(log(N)), eg. via a priority queue, triggering only when a validator moves from eligible to ineligible status. reset_penalty_denominator is O(N), but would only run once when a block appears, so would not need to be run per epoch. This way we can massively decrease the cost of processing an epoch transition in the case where an epoch is empty.

JustinDrake commented 5 years ago

get_total_active_balance is O(N)

Can we keep track of (i.e. cache) total balance (incrementally modifying it if the effective balance of a validator changes), making get_total_active_balance O(1)?

vbuterin commented 5 years ago

Yep! That's definitely another important optimization. In fact, during an empty epoch, the only time total active balance should change is when a validator enters or exits the validator set.