piplabs / story

Official repo for the Story L1 consensus client, contracts, and associated tooling.
GNU General Public License v3.0

Recover geth and story clients when geth client is upgraded after the upgrade height #144

Open limengformal opened 6 days ago

limengformal commented 6 days ago

Description and context

In the event of a geth hard fork upgrade, if a node's geth client has not been upgraded by the time the chain reaches the upgrade block height, the story client may panic because it can no longer reach consensus with the rest of the nodes, which have already upgraded their geth clients.

Upgrading only the geth client at this point does not help, because the story client has already verified or proposed a conflicting block. The node's only option is then to remove its data folder and sync from the genesis block, which takes a long time.

Suggested solution

One suggested solution is to roll back the block with the incorrect proposal so that the node can sync the correct block.

Definition of done

A node can roll back the incorrect block and restart with the correct block in the event of a late geth upgrade.

limengformal commented 4 days ago

How to reproduce:

Install Cast (if not installed yet):

curl -L https://foundry.paradigm.xyz/ | bash
source /home/ec2-user/.zshenv
foundryup
Then run the following call against the node:

cast call 0x0000000000000000000000000000000000000100 "0x4cee90eb86eaa050036147a12d49004b6b9c72bd725d39d4785011fe190f0b4da73bd4903f0ce3b639bbbf6e8e80d16931ff4bcf5993d58468e8fb19086e8cac36dbcd03009df8c59286b162af3bd7fcc0450c9aa81be5d10d312af6c66b1d604aebd3099c618202fcfe16ae7770b0c49ab5eadf74b754204a3bb6060e44eff37618b065f9832de4ca6ca971a7a1adc826d0f7c00181a5fb2ddf79ae00b4e10e" --rpc-url <node url to 8545 port, e.g. http://localhost:8545/>
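To confirm that a node is actually stuck after the steps above, one option is to sample the geth block height twice and compare. The following is only a sketch, under the assumption that a stuck node stops advancing its execution-layer height. The helper names `get_height` and `check_progress` are hypothetical and not part of the Story tooling; `cast block-number` is the Foundry command used to read the height.

```shell
# Hypothetical helpers, not part of the Story or geth tooling.

# Print the latest block height reported by the node's JSON-RPC endpoint.
get_height() {
  cast block-number --rpc-url "$1"
}

# Sample the height twice, <interval> seconds apart (default 10).
# Returns 0 if the height advanced, 1 if it did not, 2 if the RPC call failed.
check_progress() {
  local url="$1" interval="${2:-10}" before after
  before="$(get_height "$url")" || return 2
  sleep "$interval"
  after="$(get_height "$url")" || return 2
  if [ "$after" -gt "$before" ]; then
    echo "progressing: $before -> $after"
  else
    echo "stuck at height $after"
    return 1
  fi
}
```

For example, `check_progress http://localhost:8545 30` samples the local node 30 seconds apart. The interval should be longer than the chain's block time, otherwise a healthy node can be reported as stuck.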

zsystm commented 3 hours ago

@limengformal I have successfully reproduced the stuck scenario.

Currently, I'm working on integrating the CometBFT rollback command into the Story codebase to test if this rollback feature can resolve the issue.