prysmaticlabs / prysm

Go implementation of Ethereum proof of stake
https://www.offchainlabs.com
GNU General Public License v3.0

Local devnet blocked when running for too long. #14042

Closed mask-pp closed 2 months ago

mask-pp commented 5 months ago

Describe the bug

Based on this repo. After running for about 1 to 2 weeks, the devnet stalls and errors appear.

And this is my config content:
`
CONFIG_NAME: interop
PRESET_BASE: interop

# Genesis
GENESIS_FORK_VERSION: 0x20000089

# Altair
ALTAIR_FORK_EPOCH: 0
ALTAIR_FORK_VERSION: 0x20000090

# Merge
BELLATRIX_FORK_EPOCH: 0
BELLATRIX_FORK_VERSION: 0x20000091
TERMINAL_TOTAL_DIFFICULTY: 0

# Capella
CAPELLA_FORK_EPOCH: 0
CAPELLA_FORK_VERSION: 0x20000092
MAX_WITHDRAWALS_PER_PAYLOAD: 16

DENEB_FORK_EPOCH: 0
DENEB_FORK_VERSION: 0x20000093

# Time parameters
SECONDS_PER_SLOT: 2
SLOTS_PER_EPOCH: 32

# Deposit contract
DEPOSIT_CONTRACT_ADDRESS: 0x4242424242424242424242424242424242424242
`

Has this worked before in a previous version?

I have used the stable Docker image (gcr.io/prysmaticlabs/prysm/validator:stable) for several months. This error still appears regularly.

🔬 Minimal Reproduction

No response

Error

error content:
`
level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=440.698708ms slot=44687

level=error msg="Could not compute head from new attestations" error="0x018d1149dfba4ba1f6bfc84fbb353888994842d4e43ceac505f814b1f1ad63f3: unknown justified root" prefix=blockchain

level=error msg="Received nil payload ID on VALID engine response" blockHash=0x475b3ff306bed7275a036ddb2a16807488e29c787e1323ab248615966b3fba39 prefix=blockchain slot=44544
`

Platform(s)

Linux (x86), Mac (Apple Silicon)

What version of Prysm are you running? (Which release)

No response

Anything else relevant (validator index / public key)?

No response

mask-pp commented 5 months ago

@prestonvanloon I'm not sure if this is a bug, but this kind of problem occurs after running for a long time. I would be grateful if I could get your help :)

lzmrd commented 4 months ago

Hi! I guess this is due to a software update. In fact, if you try to follow that tutorial today you will get an error.

allnil commented 4 months ago

any possible fixes?

prestonvanloon commented 4 months ago

We are going to need more info than what is provided to troubleshoot this. It won't be feasible for us to run a local devnet for more than 1 week to see if we can reproduce it.

Two seconds per slot is very tight. Maybe there is a deadlock condition you are hitting. Hard to tell with three lines of logs and a config.
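
To put a rough number on "very tight": assuming the standard consensus-spec timing applies to this devnet (attestations are due one third of the way into a slot, i.e. INTERVALS_PER_SLOT = 3), the sketch below compares the attestation deadline for a 2-second slot with the 12-second mainnet slot, using the sinceSlotStartTime=440.698708ms value from the reported log. The function names are hypothetical helpers for illustration, not Prysm APIs.

```go
package main

import (
	"fmt"
	"time"
)

// intervalsPerSlot mirrors INTERVALS_PER_SLOT (3) from the consensus spec.
// Assumption: the devnet does not override this value.
const intervalsPerSlot = 3

// attestationDeadline returns how far into a slot attesters are expected to vote.
func attestationDeadline(secondsPerSlot int) time.Duration {
	return time.Duration(secondsPerSlot) * time.Second / intervalsPerSlot
}

// remainingBudget returns the time left before the attestation deadline,
// given how far into the slot block building started.
func remainingBudget(secondsPerSlot int, sinceSlotStart time.Duration) time.Duration {
	return attestationDeadline(secondsPerSlot) - sinceSlotStart
}

func main() {
	// Taken from the "Begin building block" log line in the report.
	started := 440698708 * time.Nanosecond

	// Compare the reported config (2s) with mainnet (12s).
	for _, s := range []int{2, 12} {
		fmt.Printf("SECONDS_PER_SLOT=%d deadline=%v budget=%v\n",
			s, attestationDeadline(s), remainingBudget(s, started))
	}
}
```

Under these assumptions, a two-second slot puts the deadline at roughly 667 ms, so a block that only starts building 440 ms into the slot leaves a little over 200 ms for building, broadcast, and import.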

CoNETProject commented 2 months ago

We have been running a beacon chain for over 62 days. It has over 70 nodes running Geth and Prysm (beacon and validator), very similar to your setup, with over 200K wallet addresses and over 700K transactions. https://scanold.conet.network

It ran fine until I copied all of a full node's files to another server and started it there (I wanted to set up more RPC endpoints to serve our clients). Two nodes with the same key pair sent the beacon chain into "slasher", and it stopped creating new blocks even though we have 70 nodes.

I tried to get the beacon chain to resume creating new blocks, but it looks impossible. We gave up.

We created a new chain: https://scan.conet.network

CoNETProject commented 2 months ago

Our new chain has stopped. Here are the logs:

sg="Chain reorg occurred" commonAncestorRoot=0x88bdc231adf8d4d02e8bbbb4a3bdc2074f66ce99cd4fd795e096e00eff864541 depth=7 distance=13 newRoot=0x9aad4d8dcda0ebdc7db3059eb008e8a511ac5c62755a1d9fe52b606bf46a1c97 newSlot=18436 newWeight=32000000000 oldRoot=0x8dc0d431d1d06e8214370ddc35464d23c422662705d5e7f2e389ffd5c3020035 oldSlot=18435 oldWeight=12800000000 prefix=blockchain
time="2024-08-08 15:35:22" level=info msg="Updated fee recipient addresses for validator indices" prefix="rpc/validator" validatorCount=1
time="2024-08-08 15:35:22" level=info msg="Attempted late block reorg aborted due to attestations at 12 seconds" prefix=blockchain root=0xdd1c4968e72500ce42e16cc687ff3b1e2ddc0ac2be94c5a07154226c55880d20 weight=0
time="2024-08-08 15:35:22" level=info msg="Chain reorg occurred" commonAncestorRoot=0x88bdc231adf8d4d02e8bbbb4a3bdc2074f66ce99cd4fd795e096e00eff864541 depth=8 distance=15 newRoot=0xdd1c4968e72500ce42e16cc687ff3b1e2ddc0ac2be94c5a07154226c55880d20 newSlot=18437 newWeight=0 oldRoot=0x9aad4d8dcda0ebdc7db3059eb008e8a511ac5c62755a1d9fe52b606bf46a1c97 oldSlot=18436 oldWeight=32000000000 prefix=blockchain
time="2024-08-08 15:35:46" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=164.539645ms slot=18440
time="2024-08-08 15:35:46" level=warning msg="could not find tracked proposer index" headRoot=0xdd1c4968e72500ce42e16cc687ff3b1e2ddc0ac2be94c5a07154226c55880d20 slot=18440 validatorIndex=1
time="2024-08-08 15:35:46" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=284.128387ms slot=18440 validator=1
time="2024-08-08 15:36:10" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=115.806421ms slot=18442
time="2024-08-08 15:36:10" level=warning msg="could not find tracked proposer index" headRoot=0xdd1c4968e72500ce42e16cc687ff3b1e2ddc0ac2be94c5a07154226c55880d20 slot=18442 validatorIndex=3
time="2024-08-08 15:36:10" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=214.909513ms slot=18442 validator=3
time="2024-08-08 15:36:15" level=info msg="Connected peers" inboundTCP=4 outboundTCP=2 prefix=p2p total=6
time="2024-08-08 15:36:34" level=warning msg="Subnet weight is 0, skipping initializing topic scoring" prefix=p2p
time="2024-08-08 15:36:34" level=info msg="Subscribed to" prefix=sync topic="/eth2/9a0b2e01/beacon_attestation_4/ssz_snappy"
time="2024-08-08 15:36:34" level=info msg="Updated fee recipient addresses for validator indices" prefix="rpc/validator" validatorCount=1
time="2024-08-08 15:36:35" level=error msg="Could not handle p2p pubsub" error="failed to validate consensus state transition function: could not batch verify signature: some signatures are invalid. details:
signature 'attestation signature' is invalid. signature: 0x922cc51fc47bf7e28a9aeaef049b7bfd7202b9c8d19560ff92a547f6288c71fbc95076cacfd4549850de27d27a1728680cbc4997a116adb9fe8a6ef08de540544d61fbb5efeef59efd33a7f455eb40f8c787607dc4ec240306dd1a2e173dd32c, public key: 0xa3a32b0f8b4ddb83f1a0a853d81dd725dfe577d4f4c3db8ece52ce2b026eca84815c1a7e8e92a4de3d755733bf7e4a9b, message: 0x1d2dead6395a9dae0c4fc168dd49f4d5918982c28f18f043a6e22f67912ccf4a" prefix=sync topic="/eth2/9a0b2e01/beacon_block/ssz_snappy"
time="2024-08-08 15:36:47" level=error msg="Could not handle p2p pubsub" error="failed to validate consensus state transition function: could not batch verify signature: some signatures are invalid. details:
signature 'attestation signature' is invalid. signature: 0x83f47c61494996ccdf3f8e6f9daad285ce61e988e2ef10773eb3e9ad1d8e5c2907c99ddee39af058ff9b324e5a26c8df04742f07a64aae6bb4999cab63aaa0b7c91d2b1745a12fc3486bdd65aa1a900fd46542a450d590171ad8c84e55d22881, public key: 0xab0bdda0f85f842f431beaccf1250bf1fd7ba51b4100fd64364b6401fda85bb0069b3e715b58819684e7fc0b10a72a34, message: 0x3b766808b5245d13e7048b764601591f5530312540eb80d71d95ab29353fe6f9
signature 'attestation signature' is invalid. signature: 0xb38b0f8373e29aedb6a91d12e71e3349f6e75b06cdbdc3213f7926cbf22b4a2d6f9fc66c62589fdecf6f7733dda594cc17e5a6ec448192241be04ab6ce55738912e4dede88075e0b6c2b4dccc50dd98d278276d2ea802ca0afcb3c05d7ae5386, public key: 0x81283b7a20e1ca460ebd9bbd77005d557370cabb1f9a44f530c4c4c66230f675f8df8b4c2818851aa7d77a80ca5a4a5e, message: 0xb9aaab53d45cd80a8ca20f93b4b8d89c08383a540eba905561e36a29bf1abf08
signature 'attestation signature' is invalid. signature: 0x911b8fa639d2301c8a4cf22af7446bf62211987f06c37b59439b9fa2940e66550af827a22f580dcb26f627639769a225046770bc3b70dfbc3ca6a373ba9337409b466e3590b53e7e195fe42b6218716e349c4c8eaa0b3abc74ef1acedace79f1, public key: 0xa99a76ed7796f7be22d5b7e85deeb7c5677e88e511e0b337618f8c4eb61349b4bf2d153f649f7b53359fe8b94a38e44c, message: 0x844f56f6dec3e44a3b87d4ff68583206cf37980c657c1b1ae5fad23ddc8648fe" prefix=sync topic="/eth2/9a0b2e01/beacon_block/ssz_snappy"
time="2024-08-08 15:36:50" level=info msg="Forkchoice updated with payload attributes for proposal" blockRoot=0xdd1c4968e725 headSlot=18437 payloadID=0x037d0c34d1e1 prefix=blockchain
time="2024-08-08 15:36:58" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=136.636169ms slot=18446
time="2024-08-08 15:36:58" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=269.56548ms slot=18446 validator=2
time="2024-08-08 15:37:15" level=info msg="Connected peers" inboundTCP=0 outboundTCP=1 prefix=p2p total=1
time="2024-08-08 15:37:26" level=info msg="Forkchoice updated with payload attributes for proposal" blockRoot=0xdd1c4968e725 headSlot=18437 payloadID=0x03682228f713 prefix=blockchain
time="2024-08-08 15:37:34" level=info msg="Begin building block" prefix="rpc/validator" sinceSlotStartTime=187.466088ms slot=18449
time="2024-08-08 15:37:34" level=info msg="Finished building block" prefix="rpc/validator" sinceSlotStartTime=363.224459ms slot=18449 validator=2
time="2024-08-08 15:37:38" level=info msg="Forkchoice updated with payload attributes for proposal" blockRoot=0xdd1c4968e725 headSlot=18437 payloadID=0x038d369d5cb3 prefix=blockchain
time="2024-08-08 15:37:38" level=error msg="Failed to broadcast sync committee message" error="could not publish message: unable to find requisite number of peers for topic /eth2/9a0b2e01/sync_committee_0/ssz_snappy, 0 peers found to publish to: context deadline exceeded" prefix=p2p
time="2024-08-08 15:37:38" level=error msg="Failed to broadcast sync committee message" error="could not publish message: unable to find requisite number of peers for topic /eth2/9a0b2e01/sync_committee_0/ssz_snappy, 0 peers found to publish to: context deadline exceeded" prefix=p2p
time="2024-08-08 15:37:38" level=error msg="Failed to broadcast sync committee message" error="could not publish message: unable to find requisite number of peers for topic /eth2/9a0b2e01/sync_committee_3/ssz_snappy, 0 peers found to publish to: context deadline exceeded" prefix=p2p
time="2024-08-08 15:37:38" level=error msg="Failed to broadcast sync committee message" error="could not publish message: unable to find requisite number of peers for topic /eth2/9a0b2e01/sync_committee_0/ssz_snappy, 0 peers found to publish to: context deadline exceeded" prefix=p2p
time="2024-08-08 15:37:38" level=error msg="Failed to broadcast sync committee message" error="could not publish message: unable to find requisite number of peers for topic /eth2/9a0b2e01/sync_committee_0/ssz_snappy, 0 peers found to publish to: context deadline exceeded" prefix=p2p
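
For logs of this size, a quick tally of error messages can show which failure dominates. The following is a minimal sketch that assumes the logfmt-style layout seen above (`level=...` and a quoted `msg="..."` on each entry). Continuation lines without a `level=` field, such as the per-signature details, are skipped. `countByMessage` is a hypothetical helper, not part of Prysm.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
)

var (
	levelRe = regexp.MustCompile(`level=(\w+)`)
	msgRe   = regexp.MustCompile(`msg="([^"]*)"`)
)

// countByMessage tallies msg values for log lines at the given level.
func countByMessage(lines []string, level string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		lv := levelRe.FindStringSubmatch(line)
		if lv == nil || lv[1] != level {
			continue // not a log entry, or a different level
		}
		if m := msgRe.FindStringSubmatch(line); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	// Error entries can be long, so raise the scanner's line limit to 1 MiB.
	sc.Buffer(make([]byte, 0, 1024*1024), 1024*1024)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}

	counts := countByMessage(lines, "error")
	msgs := make([]string, 0, len(counts))
	for m := range counts {
		msgs = append(msgs, m)
	}
	// Most frequent message first.
	sort.Slice(msgs, func(i, j int) bool { return counts[msgs[i]] > counts[msgs[j]] })
	for _, m := range msgs {
		fmt.Printf("%4d  %s\n", counts[m], m)
	}
}
```

Piping a saved log file into the program on stdin prints one line per distinct error message with its count.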

mask-pp commented 2 months ago

It was caused by a k8s restart, which resulted from an operational error on our team's side. Sorry for taking up your resources.