Open ncavedale-xlabs opened 4 months ago
Hi @ncavedale-xlabs, can you try setting debug.vmodule("eth/*=6,p2p=6")
via the geth console and send me the logs when this happens again?
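If attaching a console is inconvenient, the same setting can probably be applied over JSON-RPC, since console calls in the debug namespace map to RPC methods (debug.vmodule to debug_vmodule). The following is only a sketch: it assumes the node exposes the debug namespace over HTTP, and the URL and the helper names vmodule_request and set_vmodule are placeholders, not part of l2geth.

```python
import json
import urllib.request


def vmodule_request(pattern, request_id=1):
    """Build the JSON-RPC body for a debug_vmodule call."""
    return {
        "jsonrpc": "2.0",
        "method": "debug_vmodule",
        "params": [pattern],
        "id": request_id,
    }


def set_vmodule(pattern, url="http://127.0.0.1:8545"):
    """POST the request to the node.

    The URL is an assumption; the call only succeeds if the debug
    namespace is enabled on that HTTP endpoint.
    """
    body = json.dumps(vmodule_request(pattern)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)
```

Calling set_vmodule("eth/*=6,p2p=6") against a running node would then raise the eth and p2p log verbosity in the same way as the console command.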
I got the same error after upgrading to version 5.6 - seems like the chain id is wrong for some reason.
trace output:
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.419] Dial error id=69340d7cd19d5c58 addr=52.74.58.167:32213 conn=dyndial err="i/o timeout"
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.419] Dial error id=66591c6913b192e7 addr=13.43.97.33:32370 conn=dyndial err="i/o timeout"
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.546] Failed RLPx handshake addr=3.234.173.24:30303 conn=dyndial err=EOF
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.549] Accepted connection addr=89.116.24.44:47946
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.580] Rejected peer id=555ee9df9c6e75de addr=89.116.24.44:47946 conn=inbound err="useless peer"
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.732] Failed RLPx handshake addr=138.199.25.14:30303 conn=dyndial err=EOF
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.750] Failed RLPx handshake addr=138.199.25.14:30303 conn=dyndial err=EOF
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.803] Failed RLPx handshake addr=107.21.220.181:30303 conn=dyndial err=EOF
Aug 16 07:28:58 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:58.991] Accepted connection addr=18.139.114.143:46858
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:59.251] Rejected peer id=548938a66ad90bbf addr=18.139.114.143:46858 conn=inbound err="useless peer"
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:59.282] Accepted connection addr=18.171.67.70:15819
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: DEBUG[08-16|07:28:59.319] Adding p2p peer peercount=2 id=5e3d40fba57c443f conn=inbound addr=18.171.67.70:15819 name=parchain/v1.11.3-uns...
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:59.319] Starting protocol eth/68 id=5e3d40fba57c443f conn=inbound
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: DEBUG[08-16|07:28:59.319] Ethereum handshake failed id=5e3d40fba57c443f conn=inbound err="network ID mismatch: 600002 (!= 534352)"
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:59.319] Protocol eth/68 failed id=5e3d40fba57c443f conn=inbound err="network ID mismatch: 600002 (!= 534352)"
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: DEBUG[08-16|07:28:59.319] Removing p2p peer peercount=1 id=5e3d40fba57c443f duration="183.352µs" req=false err="network ID mismatch: 600002 (!= 534352)"
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:59.401] Accepted connection addr=64.176.56.219:41124
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: DEBUG[08-16|07:28:59.704] Adding p2p peer peercount=2 id=72f64269f4e2bf47 conn=inbound addr=64.176.56.219:41124 name=Geth/v1.3.4-stable-a...
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:59.704] Starting protocol eth/68 id=72f64269f4e2bf47 conn=inbound
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: DEBUG[08-16|07:28:59.704] Ethereum handshake failed id=72f64269f4e2bf47 conn=inbound err="network ID mismatch: 4058 (!= 534352)"
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: TRACE[08-16|07:28:59.704] Protocol eth/68 failed id=72f64269f4e2bf47 conn=inbound err="network ID mismatch: 4058 (!= 534352)"
Aug 16 07:28:59 axelar-archive nccc_geth[3393286]: DEBUG[08-16|07:28:59.704] Removing p2p peer peercount=1 id=72f64269f4e2bf47 duration="271.677µs" req=false err="network ID mismatch: 4058 (!= 534352)"
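To see at a glance why peers are being dropped in a trace like the one above, the err= fields can be tallied with a short script. This is a rough sketch, not a tool that ships with geth: the function name tally_errors and the regular expression are my own, and the sample lines are copied from the trace with the syslog prefix removed.

```python
import re
from collections import Counter

# Matches both err=EOF and err="quoted message".
ERR_RE = re.compile(r'\berr=(?:"([^"]*)"|(\S+))')


def tally_errors(lines):
    """Count log lines per error kind, ignoring lines without err=."""
    counts = Counter()
    for line in lines:
        m = ERR_RE.search(line)
        if not m:
            continue
        msg = m.group(1) if m.group(1) is not None else m.group(2)
        # Collapse "network ID mismatch: 4058 (!= 534352)" to its prefix
        # so different remote network IDs land in one bucket.
        counts[msg.split(":", 1)[0]] += 1
    return counts


sample = [
    'TRACE[08-16|07:28:58.419] Dial error id=69340d7cd19d5c58 addr=52.74.58.167:32213 conn=dyndial err="i/o timeout"',
    'TRACE[08-16|07:28:58.546] Failed RLPx handshake addr=3.234.173.24:30303 conn=dyndial err=EOF',
    'TRACE[08-16|07:28:58.549] Accepted connection addr=89.116.24.44:47946',
    'TRACE[08-16|07:28:58.580] Rejected peer id=555ee9df9c6e75de addr=89.116.24.44:47946 conn=inbound err="useless peer"',
    'DEBUG[08-16|07:28:59.319] Ethereum handshake failed id=5e3d40fba57c443f conn=inbound err="network ID mismatch: 600002 (!= 534352)"',
    'DEBUG[08-16|07:28:59.704] Ethereum handshake failed id=72f64269f4e2bf47 conn=inbound err="network ID mismatch: 4058 (!= 534352)"',
]

for kind, n in tally_errors(sample).most_common():
    print(f"{n:3d}  {kind}")
```

Note that this counts log lines, not peers: in the trace above each mismatching peer produces three lines with the same error (handshake failed, protocol failed, peer removed).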
Same error with 5.6
System information
One of our testnet nodes has seemingly lost connection to all peers and cannot reconnect. As a result, the node stopped receiving new blocks. It remained in this state for several hours, until we restarted the service, after which it caught up normally. This is the second time this has happened to us; I just don't remember whether the first occurrence was on testnet or mainnet.
Expected behaviour
Node should be able to find/reconnect to peers and keep syncing
Actual behaviour
Node loses connection to peers and doesn't find/reconnect to any peer until l2geth is restarted
Steps to reproduce the behaviour
N/A
Backtrace
Node was importing new chain segments until it wasn't: