status-im / nimbus-eth2

Nim implementation of the Ethereum Beacon Chain
https://nimbus.guide

Error: Unhandled exception: Asynchronous task [sendMessageSlow() at pubsubpeer.nim:301] was cancelled! [FutureDefect] #6276

Closed celeduc closed 4 months ago

celeduc commented 5 months ago

Describe the bug: Nimbus validator node crashed during normal operation. Note the strange inclusion of "/nim-testutils/testutils/moduletests.nim(21) moduletests" in the stack trace, as well as "[FutureDefect]" (a self-fulfilling prophecy, that one).

To Reproduce:

  1. Platform details (OS, architecture): Ubuntu 22.04.4 amd64
  2. Branch/commit used: nimbus-eth2_Linux_amd64_24.4.0_f20a21c0; geth-linux-amd64-1.14.0-87246f3c
  3. Commands being executed: ./run-mainnet-beacon-node.sh --rest --subscribe-all-subnets --nat:extip:x.x.x.x --jwt-secret=/home/cleduc/geth-mainnet-secret.jwt --el=http://127.0.0.1:8551 --payload-builder=true --payload-builder-url=http://127.0.0.1:18550 --suggested-fee-recipient=0x... --metrics --metrics-port=8008 --doppelganger-detection=off
  4. Relevant log lines:
    INF 2024-05-08 12:58:03.819+00:00 Attestation sent                         topics="message_router" attestation="(aggregation_bits:0b000000000000000000000000...000, data: (slot: 9029088, index: 3, beacon_block_root: \"6028bdbe\", source: \"282158:d011bbcd\", target: \"282159:6028bdbe\"), signature: \"865edc5d\")" delay=819ms304us800ns subnet_id=3
    peers: 162 > finalized: 5c050910:282157 > head: 6028bdbe:282159:0 > time: 282159:0 (9029088) > sync: synced
    ETH: 32.000112
    /home/user/nimbus-eth2/vendor/nim-testutils/testutils/moduletests.nim(21) moduletests
    /home/user/nimbus-eth2/beacon_chain/nimbus.beacon.node.nim(2392) main
    /home/user/nimbus-eth2/beacon_chain/nimbus.beacon.node.nim(2315) handleStartUpCmd
    /home/user/nimbus-eth2/beacon_chain/nimbus.beacon.node.nim(2392) doRunBeaconNode
    /home/user/nimbus-eth2/beacon_chain/nimbus.beacon.node.nim(2392) start
    /home/user/nimbus-eth2/beacon_chain/nimbus.beacon.node.nim(2392) run
    /home/user/nimbus-eth2/vendor/nim-chronos/chronos/internal/asyncengine.nim(150) poll
    Error: Unhandled exception: Asynchronous task [sendMessageSlow() at pubsubpeer.nim:301] was cancelled! [FutureDefect]

Screenshots: Screenshot from 2024-05-08 15-40-32

Additional context: Node was running uninterrupted through two consecutive sync committees.

tersec commented 5 months ago

https://github.com/vacp2p/nim-libp2p/pull/1094

celeduc commented 5 months ago

What is /home/user/nimbus-eth2/vendor/nim-testutils/testutils/moduletests.nim(21) moduletests doing on the stack?

tersec commented 4 months ago

> What is /home/user/nimbus-eth2/vendor/nim-testutils/testutils/moduletests.nim(21) moduletests doing on the stack?

https://github.com/status-im/nimbus-eth2/blob/f20a21c01555b62aa747508cc6bb006a619c6998/config.nims#L150

which comes from https://github.com/status-im/nimbus-eth2/commit/740b76d15292e827218e04b546ab0770825de726#diff-be274e89063d9377278fad5fdcdd936e89d2f32efd7eb8eb8a6a83ac4c711879

celeduc commented 4 months ago

Now a 48h MTBF.

celeduc commented 4 months ago

Still crashing regularly. Any ETA on a release?

tersec commented 4 months ago

Next few days.

celeduc commented 4 months ago

Assuming this is fixed with v24.5.0?

tersec commented 4 months ago

It is, yes
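
For anyone landing here from the error message: the wording suggests a fire-and-forget background task was cancelled and nothing handled that cancellation. Below is a rough analogy in Python's asyncio, not the Nim/chronos code and not a description of what the linked nim-libp2p PR changed. All names (`spawn_strict`, `send_message_slow`, `send_message_guarded`, and the `FutureDefect` stand-in class) are hypothetical and only mirror the log line. It contrasts a task that lets cancellation escape with one that treats cancellation as a normal way to stop.

```python
import asyncio


class FutureDefect(Exception):
    """Stand-in named after the log line; not a real asyncio type."""


def spawn_strict(coro, name):
    """Start a fire-and-forget task and record a defect if it ends badly.

    A strict runtime might abort the process here; this sketch only
    collects the defect in a list so the outcome can be inspected.
    """
    defects = []
    task = asyncio.create_task(coro)

    def check(t):
        if t.cancelled():
            defects.append(
                FutureDefect(f"Asynchronous task [{name}] was cancelled!")
            )
        elif t.exception() is not None:
            defects.append(
                FutureDefect(f"Asynchronous task [{name}] failed: {t.exception()!r}")
            )

    task.add_done_callback(check)
    return task, defects


async def send_message_slow():
    # Pretend to wait on a slow peer; cancellation escapes the task.
    await asyncio.sleep(3600)


async def send_message_guarded():
    # Same wait, but cancellation is treated as an ordinary shutdown.
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        return


async def demo():
    results = {}
    variants = (("unguarded", send_message_slow), ("guarded", send_message_guarded))
    for label, fn in variants:
        task, defects = spawn_strict(fn(), label)
        await asyncio.sleep(0)  # let the task reach its first await
        task.cancel()
        await asyncio.gather(task, return_exceptions=True)
        await asyncio.sleep(0)  # give done callbacks a chance to run
        results[label] = [str(d) for d in defects]
    return results


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch only the unguarded variant records a defect. Whether swallowing cancellation is appropriate depends on the runtime and on who owns the task, so treat it as an illustration of the failure mode rather than a recommended fix.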