ChainSafe / lodestar

🌟 TypeScript Implementation of Ethereum Consensus
https://lodestar.chainsafe.io
Apache License 2.0
1.16k stars 284 forks source link

Finalized header not updated during sync committee change #6932

Open nflaig opened 3 months ago

nflaig commented 3 months ago

Based on the logs from https://github.com/ChainSafe/lodestar/issues/6861#issuecomment-2205800172, it seems like the light client server is not updating the finality header if sync committee changes, i.e. during transition in new sync committee period.

Looking at logs from beacon-2024-07-01.log.gz

New sync committee period started

Jul-01 22:40:29.001[]                 info: New sync committee period 1150

It seems like we are storing new partial updates but all are not finalized

Jul-01 22:40:37.174[chain]           debug: Stored new PartialLightClientUpdate syncPeriod=1150, isFinalized=false, participation=0.900390625
Jul-01 22:40:50.201[chain]           debug: Stored new PartialLightClientUpdate syncPeriod=1150, isFinalized=false, participation=0.98828125
Jul-01 22:41:12.950[chain]           debug: Stored new PartialLightClientUpdate syncPeriod=1150, isFinalized=false, participation=0.990234375
Jul-01 22:41:49.108[chain]           debug: Stored new PartialLightClientUpdate syncPeriod=1150, isFinalized=false, participation=0.9921875
Jul-01 22:42:13.916[chain]           debug: Stored new PartialLightClientUpdate syncPeriod=1150, isFinalized=false, participation=0.99609375
Jul-01 22:42:24.880[chain]           debug: Stored new PartialLightClientUpdate syncPeriod=1150, isFinalized=false, participation=0.998046875

and next sync committee period

Jul-01 22:40:26.647[chain]           debug: Stored nextSyncCommittee period=1150, slot=9420800, dependentRoot=0x1b67d2012d73c2ff7cc81ed2500114c0d37378c586d075eb8db3ff66da24f82e

My best guess right now is that this is due to the check here https://github.com/ChainSafe/lodestar/blob/dbaa46e24a3d8f8dc889f9aac6fc73f31290d73e/packages/beacon-node/src/chain/lightClient/index.ts#L431 as sync period of finalized checkpoint would be 1149 while current period is 1150 which means we are storing it as isFinalized=false

nflaig commented 3 months ago

Confirmed this issue on our mainnet node as well, the finalized_header is not updated for 2 epochs after new sync committee period starts.

Those apis return different finalized slots

curl -s https://lodestar-mainnet.chainsafe.io/eth/v1/beacon/light_client/finality_update | jq -r .data.finalized_header.beacon.slot
9445280

curl -s https://lodestar-mainnet.chainsafe.io/eth/v2/beacon/blocks/finalized | jq -r .data.message.slot
9445343

The light client server is lacking behind.

We have two places in the code where we compare the sync committee period which I think is the reason for this https://github.com/ChainSafe/lodestar/blob/dbaa46e24a3d8f8dc889f9aac6fc73f31290d73e/packages/beacon-node/src/chain/lightClient/index.ts#L431 https://github.com/ChainSafe/lodestar/blob/b6e20771c190f93465b3ea51bbd36f7e2e90ec21/packages/beacon-node/src/chain/lightClient/index.ts#L637

From the light client spec it's not clear why this check is required.

There is a note about when to update

LightClientUpdate are only considered if compute_sync_committee_period_at_slot(update.attested_header.beacon.slot) == compute_sync_committee_period_at_slot(update.signature_slot)

But this compared the sync period of attested_header slot and not finalized_header.

There is no such check when setting finalized header

# Indicate finality whenever possible
    if finalized_block is not None:
        if finalized_block.message.slot != GENESIS_SLOT:
            update.finalized_header = block_to_light_client_header(finalized_block)
            assert hash_tree_root(update.finalized_header.beacon) == attested_state.finalized_checkpoint.root
        else:
            assert attested_state.finalized_checkpoint.root == Bytes32()
        update.finality_branch = FinalityBranch(
            compute_merkle_proof(attested_state, FINALIZED_ROOT_GINDEX))

Maybe I am missing something from the spec, but our implementation looks incorrect to me