prysmaticlabs / prysm

Go implementation of Ethereum proof of stake
https://www.offchainlabs.com
GNU General Public License v3.0
3.47k stars 999 forks source link

"Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0 #10920

Closed duktig666 closed 1 year ago

duktig666 commented 2 years ago

🐞 Bug Report

Description

beacon-node sync question. Syncing for about 10 minutes causes this problem:

"Peer summary" activePeers=2 inbound=0 outbound=2 prefix=p2p
"Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0

some info:

time="2022-06-21 11:29:58" level=warning msg="Running on the Prater Testnet" prefix=flags
time="2022-06-21 11:29:58" level=info msg="Checking DB" database-path="/home/prysm/kotal-data/beaconchaindata" prefix=node
time="2022-06-21 11:29:58" level=info msg="Deposit contract: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b" prefix=node
time="2022-06-21 11:29:58" level=info msg="Waiting for state to be initialized" prefix=initial-sync
time="2022-06-21 11:29:58" level=info msg="Starting beacon node" prefix=node version="Prysm/Unknown/. Built at: 2021-12-22 16:17:22+00:00"
time="2022-06-21 11:29:58" level=info msg="Waiting to reach the validator deposit threshold to start the beacon chain..." prefix=blockchain
time="2022-06-21 11:29:58" level=info msg="gRPC server listening on port" address="0.0.0.0:8888" prefix=rpc
time="2022-06-21 11:29:58" level=info msg="Starting API middleware" prefix=gateway
time="2022-06-21 11:29:58" level=info msg="Starting gRPC gateway" address="0.0.0.0:9999" prefix=gateway
time="2022-06-21 11:29:58" level=info msg="Connected to eth1 proof-of-work chain" endpoint="http://ethereum-goerli-api:8599" prefix=powchain
time="2022-06-21 11:30:02" level=info msg="gRPC client connected to beacon node" addr="100.168.170.108:58762" prefix=rpc
time="2022-06-21 11:34:23" level=info msg="Processing deposits from Ethereum 1 chain" deposits=512 genesisValidators=507 prefix=powchain
time="2022-06-21 11:37:31" level=info msg="Processing deposits from Ethereum 1 chain" deposits=1024 genesisValidators=1007 prefix=powchain
time="2022-06-21 11:37:49" level=info msg="Processing deposits from Ethereum 1 chain" deposits=1536 genesisValidators=1518 prefix=powchain
................
genesisValidators=4034 prefix=powchain
time="2022-06-21 11:40:51" level=info msg="Processing deposits from Ethereum 1 chain" deposits=4608 genesisValidators=4546 prefix=powchain
time="2022-06-21 11:40:53" level=info msg="Processing deposits from Ethereum 1 chain" deposits=5120 genesisValidators=5057 prefix=powchain

sync "deposits=16384" case this question

[2022-06-21 09:42:11]  INFO powchain: Processing deposits from Ethereum 1 chain deposits=15360 genesisValidators=15157
[2022-06-21 09:42:12]  INFO powchain: Processing deposits from Ethereum 1 chain deposits=15872 genesisValidators=15669
[2022-06-21 09:42:14]  INFO powchain: Processing deposits from Ethereum 1 chain deposits=16384 genesisValidators=16181
[2022-06-21 09:42:14]  INFO powchain: Minimum number of validators reached for beacon-chain to start ChainStartTime=2021-10-02 22:06:22 +0000 UTC
[2022-06-21 09:42:14]  WARN blockchain: ProcessAttestations routine waiting for genesis time
[2022-06-21 09:42:15]  INFO blockchain: Initialized beacon chain genesis state
[2022-06-21 09:42:15]  INFO slotutil: Chain genesis time reached genesisStateRoot=41623786591fc8d9ae48eb4adab15421131d0fa91c8ff9ff861cadf57144ffb6 genesisTime=2021-10-02 22:06:22 +0000 UTC genesisValidators=16393
[2022-06-21 09:42:15]  INFO initial-sync: Starting initial chain sync...
[2022-06-21 09:42:15]  INFO initial-sync: Waiting for enough suitable peers before syncing required=3 suitable=0
[2022-06-21 09:42:15]  INFO p2p: Started discovery v5 ENR=enr:-MK4QBlgJhNolh36QDew4vkvlQwDBYSXsGHeNaZXaytIYaUaKHy2NsSqYy_wN2rn1o-d110mSebf7pmjYW2CfY7eki2GAYGFpCGUh2F0dG5ldHOIAAAAAAAAAACEZXRoMpB1IkQuAgAQIP__________gmlkgnY0gmlwhKwRAAWJc2VjcDI1NmsxoQP5gjbWoKR0qeIFkbARjCNtWaSmyRa4-GtBfYh_Z61S8YhzeW5jbmV0cwCDdGNwgjLIg3VkcIIu4A
[2022-06-21 09:42:15]  INFO p2p: Node started p2p server multiAddr=/ip4/172.17.0.5/tcp/13000/p2p/16Uiu2HAmVSy2fRbfnXngdMPTux6rZLx79Pg62uWtcMt6SnQxMxoS
[2022-06-21 09:42:15]  WARN blockchain: Genesis time received, now available to process attestations
[2022-06-21 09:42:20]  INFO initial-sync: Waiting for enough suitable peers before syncing required=3 suitable=0
[2022-06-21 09:42:25]  INFO initial-sync: Waiting for enough suitable peers before syncing required=3 suitable=0

πŸ”¬ Minimal Reproduction

πŸ”₯ Error

beacon-node :

"Peer summary" activePeers=2 inbound=0 outbound=2 prefix=p2p
"Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0

validator:

time="2022-06-22 03:14:35" level=info msg="Waiting for beacon node to sync to latest chain head" prefix=validator
--
Wed, Jun 22 2022 11:14:41 am | time="2022-06-22 03:14:41" level=info msg="Waiting for beacon node to sync to latest chain head" prefix=validator
Wed, Jun 22 2022 11:14:47 am | time="2022-06-22 03:14:47" level=info msg="Waiting for beacon node to sync to latest chain head" prefix=validator

🌍 Your Environment

Operating System:

  
   ubuntu   
  

What version of Prysm are you running? (Which release)

  
   2.1.2
  

Importantly

see issue:

But there is no problem to solve the substance.

ChiefSgt commented 2 years ago

I had exactly the same problem for a couple of days setting up a new server to take part in the upcoming Goerli/Prater merge testing.

Geth synced fine with Goerli chain and Beacon connected to local Geth and processed deposits before it went into a "Waiting for enough suitable peers" cycle for over 24 hours. It seem to never reach 3 peers which is required.

Tested older versions of Prysm, --bootstrap-node parameter, delete beacon folder and resync a couple of times.

My solution was in the end to use an old Prater genesis file.

  1. Delete all files in beacon folder.
  2. Download genesis file from https://github.com/eth-clients/eth2-networks/tree/master/shared/prater
  3. Added --genesis-state /path/to/genesis.ssz to beacon chain config and started up sync again.

I hope this information can help to narrow down the cause for this, and help others in the same situation.

Test system Ubuntu 22.04 (i7-4790, 32GB RAM, 512GB SSD) Geth 1.10.20 Prysm 2.1.3 (official release amd64 Linux binary)

Beacon-chain config ExecStart=/usr/local/bin/beacon-chain --datadir /var/lib/prysm/beacon --http-web3provider http://localhost:8545 --jwt-secret /var/lib/secrets/jwtsecret --suggested-fee-recipient --enable-vectorized-htr --accept-terms-of-use --p2p-tcp-port 13001 --p2p-udp-port 12001 --prater

JustinAvaLabs commented 2 years ago

I'm still running in to this issue. With Goerli and Mainnet and Sepolia.

Running from a kubernetes cluster in EKS. The same helm chart/config works fine in 3 of my other clusters, but constantly fails in one. I've tried destroying the data volume, setting "--force-clear-db", and restarting multiple times with different chains.

Below is the trace log. I've confirmed the genesis is correct, the Geth node is running normally, and I'm able to telnet to random peers :13000.

I notice the "WHOAREYOU" step never happens.

Any ideas on how I can further debug this?

time="2022-07-28 15:58:11" level=info msg="Raised fd limit to 1048576 from 1048576"
time="2022-07-28 15:58:11" level=warning msg="Running on the Prater Testnet" prefix=flags
time="2022-07-28 15:58:11" level=debug msg="Starting DB" prefix=node
time="2022-07-28 15:58:11" level=info msg="Checking DB" database-path="/root/.eth2/beaconchaindata" prefix=node
time="2022-07-28 15:58:11" level=info msg="Opening Bolt DB at /root/.eth2/beaconchaindata/beaconchain.db" prefix=db
time="2022-07-28 15:58:11" level=warning msg="Removing database" prefix=node
time="2022-07-28 15:58:11" level=info msg="Opening Bolt DB at /root/.eth2/beaconchaindata/beaconchain.db" prefix=db
time="2022-07-28 15:58:13" level=info msg="Deposit contract: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Starting Slashing DB" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Starting State Gen" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Registering P2P Service" prefix=node
time="2022-07-28 15:58:13" level=info msg="Running node with peer id of 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD " prefix=p2p
time="2022-07-28 15:58:13" level=debug msg="ECDSA private key generated" prefix=p2p
2022-07-28T15:58:13.650Z    DEBUG   basichost   basic/basic_host.go:313 failed to fetch local IPv6 address  {"error": "no route found for ::"}
time="2022-07-28 15:58:13" level=debug msg="Registering POW Chain Service" prefix=node
2022-07-28T15:58:13.651Z    DEBUG   basichost   basic/basic_host.go:313 failed to fetch local IPv6 address  {"error": "no route found for ::"}
2022-07-28T15:58:13.651Z    DEBUG   basichost   basic/basic_host.go:313 failed to fetch local IPv6 address  {"error": "no route found for ::"}
time="2022-07-28 15:58:13" level=debug msg="Registering Attestation Pool Service" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Registering Determinstic Genesis Service" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Starting Fork Choice" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Registering Blockchain Service" prefix=node
time="2022-07-28 15:58:13" level=info msg="--weak-subjectivity-checkpoint not provided. Prysm recommends providing a weak subjectivity checkpointfor nodes synced from genesis, or manual verification of block and state roots for checkpoint sync nodes." prefix=blockchain
time="2022-07-28 15:58:13" level=debug msg="Registering Intial Sync Service" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Registering Sync Service" prefix=node
time="2022-07-28 15:58:13" level=info msg="Waiting for state to be initialized" prefix=initial-sync
time="2022-07-28 15:58:13" level=debug msg="Registering Slasher Service" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Registering RPC Service" prefix=node
time="2022-07-28 15:58:13" level=info msg="gRPC server listening on port" address="0.0.0.0:4000" prefix=rpc
time="2022-07-28 15:58:13" level=warning msg="You are using an insecure gRPC server. If you are running your beacon node and validator on the same machines, you can ignore this message. If you want to know how to enable secure connections, see: https://docs.prylabs.network/docs/prysm-usage/secure-grpc" prefix=rpc
time="2022-07-28 15:58:13" level=debug msg="Registering GRPC Gateway Service" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Registering Validator Monitoring Service" prefix=node
time="2022-07-28 15:58:13" level=debug msg="Registering Prometheus Service" prefix=node
time="2022-07-28 15:58:13" level=info msg="Starting beacon node" prefix=node version="Prysm/v2.1.3/4de92bafc4bb0051381fba48eb36830aa181d7ee. Built at: 2022-07-06 19:14:48+00:00"
time="2022-07-28 15:58:13" level=debug msg="Starting 9 services: [*p2p.Service *powchain.Service *attestations.Service *blockchain.Service *initialsync.Service *sync.Service *rpc.Service *gateway.Gateway *prometheus.Service]" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *p2p.Service" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *powchain.Service" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *attestations.Service" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *blockchain.Service" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *initialsync.Service" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *sync.Service" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *rpc.Service" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *gateway.Gateway" prefix=registry
time="2022-07-28 15:58:13" level=debug msg="Starting service type *prometheus.Service" prefix=registry
time="2022-07-28 15:58:13" level=info msg="Blockchain data already exists in DB, initializing..." prefix=blockchain
time="2022-07-28 15:58:13" level=debug msg="Starting prometheus service" address="0.0.0.0:8080" prefix=prometheus
time="2022-07-28 15:58:13" level=info msg="Starting API middleware" prefix=gateway
time="2022-07-28 15:58:13" level=error msg="Could not connect to execution endpoint" error="could not make initial request to verify execution chain ID: Post "http://localhost:8545": dial tcp 127.0.0.1:8545: connect: connection refused" prefix=powchain
time="2022-07-28 15:58:13" level=info msg="Starting gRPC gateway" address="127.0.0.1:3500" prefix=gateway
time="2022-07-28 15:58:13" level=debug msg="Received state initialized event" prefix=initial-sync starttime=2021-03-23 14:00:00 +0000 UTC
time="2022-07-28 15:58:13" level=debug msg="Received state initialized event" prefix=sync starttime=2021-03-23 14:00:00 +0000 UTC
time="2022-07-28 15:58:13" level=info msg="Starting initial chain sync..." prefix=initial-sync
time="2022-07-28 15:58:13" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0
time="2022-07-28 15:58:13" level=debug msg="Chain started in sync service" prefix=sync starttime=2021-03-23 14:00:00 +0000 UTC
TRACE[07-28|15:58:14.000] Found seed node in database              id=2066746905d616b2 addr=18.223.219.100:7000 age=460839h58m14.000570125s
INFO [07-28|15:58:14.001] New local node record                    seq=1,659,023,894,000 id=2bafb0e551d5005f ip=10.192.21.133 udp=12000 tcp=13000
TRACE[07-28|15:58:14.001] Found seed node in database              id=b24bac2e191b48e5 addr=135.181.181.239:9000 age=460839h58m14.001983244s
TRACE[07-28|15:58:14.001] Found seed node in database              id=89c01940262d81f2 addr=52.77.252.254:9000   age=460839h58m14.002001822s
TRACE[07-28|15:58:14.002] Found seed node in database              id=e7c268588f3983b9 addr=135.181.182.51:9000  age=460839h58m14.002016556s
time="2022-07-28 15:58:14" level=info msg="Started discovery v5" ENR="enr:-MK4QFQdFj8BZ4baXPAwB_y_HQrFx-5j-UHfkw6L6ZYZ_da2dmpSXShh4QS5XEAJPdw2gj6nrdrLFw3P5RjPlCHLJjeGAYJFh6Xwh2F0dG5ldHOIAAAAAAAAAACEZXRoMpCC9KcrAgAQIP__________gmlkgnY0gmlwhArAFYWJc2VjcDI1NmsxoQIRL3llifwtm_kT8s9KN1yDSRX_9pJj7-k2yuMaJQ4mXohzeW5jbmV0cwCDdGNwgjLIg3VkcIIu4A" prefix=p2p
TRACE[07-28|15:58:14.002] Found seed node in database              id=2066746905d616b2 addr=18.223.219.100:7000  age=460839h58m14.002302651s
TRACE[07-28|15:58:14.002] Found seed node in database              id=b24bac2e191b48e5 addr=135.181.181.239:9000 age=460839h58m14.002325456s
TRACE[07-28|15:58:14.002] Found seed node in database              id=89c01940262d81f2 addr=52.77.252.254:9000   age=460839h58m14.0023406s
TRACE[07-28|15:58:14.002] Found seed node in database              id=e7c268588f3983b9 addr=135.181.182.51:9000  age=460839h58m14.002356679s
TRACE[07-28|15:58:14.002] >> FINDNODE/v5                           id=89c01940262d81f2 addr=52.77.252.254:9000
TRACE[07-28|15:58:14.002] >> FINDNODE/v5                           id=b24bac2e191b48e5 addr=135.181.181.239:9000
TRACE[07-28|15:58:14.002] >> FINDNODE/v5                           id=2066746905d616b2 addr=18.223.219.100:7000
time="2022-07-28 15:58:14" level=info msg="Node started p2p server" multiAddr="/ip4/10.192.21.133/tcp/13000/p2p/16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD" prefix=p2p
2022-07-28T15:58:14.002Z    DEBUG   basichost   basic/basic_host.go:771 host 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD dialing 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj
2022-07-28T15:58:14.002Z    DEBUG   basichost   basic/basic_host.go:771 host 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD dialing 16Uiu2HAmTttt9ZTmCmwmKiV3QR7iTAfnAckwzhswrNmWkthi6meB
2022-07-28T15:58:14.002Z    DEBUG   basichost   basic/basic_host.go:771 host 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD dialing 16Uiu2HAm2quPWmMyE7yaXaW7LSTkUaUpnw95kXaYLkDtipGBkdsT
2022-07-28T15:58:14.002Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/swarm_dial.go:241 dialing peer    {"from": "16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD", "to": "16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj"}
2022-07-28T15:58:14.002Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/swarm_dial.go:241 dialing peer    {"from": "16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD", "to": "16Uiu2HAm2quPWmMyE7yaXaW7LSTkUaUpnw95kXaYLkDtipGBkdsT"}
2022-07-28T15:58:14.002Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/swarm_dial.go:241 dialing peer    {"from": "16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD", "to": "16Uiu2HAmTttt9ZTmCmwmKiV3QR7iTAfnAckwzhswrNmWkthi6meB"}
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:193    [limiter] adding a dial job through limiter: /ip4/135.181.182.51/tcp/9000
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:161    [limiter] taking FD token: peer: 16Uiu2HAmTttt9ZTmCmwmKiV3QR7iTAfnAckwzhswrNmWkthi6meB; addr: /ip4/135.181.182.51/tcp/9000; prev consuming: 0
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:167    [limiter] executing dial; peer: 16Uiu2HAmTttt9ZTmCmwmKiV3QR7iTAfnAckwzhswrNmWkthi6meB; addr: /ip4/135.181.182.51/tcp/9000; FD consuming: 1; waiting: 0
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:193    [limiter] adding a dial job through limiter: /ip4/52.77.252.254/tcp/9000
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:161    [limiter] taking FD token: peer: 16Uiu2HAm2quPWmMyE7yaXaW7LSTkUaUpnw95kXaYLkDtipGBkdsT; addr: /ip4/52.77.252.254/tcp/9000; prev consuming: 1
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:167    [limiter] executing dial; peer: 16Uiu2HAm2quPWmMyE7yaXaW7LSTkUaUpnw95kXaYLkDtipGBkdsT; addr: /ip4/52.77.252.254/tcp/9000; FD consuming: 2; waiting: 0
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:193    [limiter] adding a dial job through limiter: /ip4/135.181.181.239/tcp/9000
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:161    [limiter] taking FD token: peer: 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj; addr: /ip4/135.181.181.239/tcp/9000; prev consuming: 2
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:167    [limiter] executing dial; peer: 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj; addr: /ip4/135.181.181.239/tcp/9000; FD consuming: 3; waiting: 0
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/swarm_dial.go:386 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD swarm dialing 16Uiu2HAmTttt9ZTmCmwmKiV3QR7iTAfnAckwzhswrNmWkthi6meB /ip4/135.181.182.51/tcp/9000
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/swarm_dial.go:386 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD swarm dialing 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj /ip4/135.181.181.239/tcp/9000
2022-07-28T15:58:14.003Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/swarm_dial.go:386 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD swarm dialing 16Uiu2HAm2quPWmMyE7yaXaW7LSTkUaUpnw95kXaYLkDtipGBkdsT /ip4/52.77.252.254/tcp/9000
INFO [07-28|15:58:14.003] New local node record                    seq=1,659,023,894,001 id=2bafb0e551d5005f ip=10.192.21.133 udp=12000 tcp=13000
time="2022-07-28 15:58:14" level=trace msg="Fork ENR mismatches between peer and local node" error="fork digest of peer with ENR -KG4QCIzJZTY_fs_2vqWEatJL9RrtnPwDCv-jRBuO5FQ2qBrfJubWOWazri6s9HsyZdu-fRUfEzkebhf1nvO42_FVzwDhGV0aDKQed8EKAAAECD__________4JpZIJ2NIJpcISHtbYziXNlY3AyNTZrMaED4m9AqVs6F32rSCGsjtYcsyfQE2K8nDiGmocUY_iq-TSDdGNwgiMog3VkcIIjKA==: [121 223 4 40], does not match local value: [130 244 167 43]" prefix=p2p
time="2022-07-28 15:58:14" level=trace msg="Fork ENR mismatches between peer and local node" error="fork digest of peer with ENR -LK4QH1xnjotgXwg25IDPjrqRGFnH1ScgNHA3dv1Z8xHCp4uP3N3Jjl_aYv_WIxQRdwZvSukzbwspXZ7JjpldyeVDzMCh2F0dG5ldHOIAAAAAAAAAACEZXRoMpB53wQoAAAQIP__________gmlkgnY0gmlwhIe1te-Jc2VjcDI1NmsxoQOkcGXqbCJYbcClZ3z5f6NWhX_1YPFRYRRWQpJjwSHpVIN0Y3CCIyiDdWRwgiMo: [121 223 4 40], does not match local value: [130 244 167 43]" prefix=p2p
time="2022-07-28 15:58:14" level=trace msg="Fork ENR mismatches between peer and local node" error="fork digest of peer with ENR -LK4QLINdtobGquK7jukLDAKmsrH2ZuHM4k0TklY5jDTD4ZgfxR9weZmo5Jwu81hlKu3qPAvk24xHGBDjYs4o8f1gZ0Bh2F0dG5ldHOIAAAAAAAAAACEZXRoMpB53wQoAAAQIP__________gmlkgnY0gmlwhDRN_P6Jc2VjcDI1NmsxoQJuNujTgsJUHUgVZML3pzrtgNtYg7rQ4K1tkWERgl0DdoN0Y3CCIyiDdWRwgiMo: [121 223 4 40], does not match local value: [130 244 167 43]" prefix=p2p
TRACE[07-28|15:58:14.003] >> FINDNODE/v5                           id=e7c268588f3983b9 addr=135.181.182.51:9000
2022-07-28T15:58:14.108Z    DEBUG   reuseport-transport com_github_libp2p_go_reuseport_transport/reuseport.go:31    failed to reuse port, will try again with a random port: dial tcp4 0.0.0.0:13000->135.181.181.239:9000: connect: connection refused
2022-07-28T15:58:14.213Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:73 [limiter] freeing FD token; waiting: 0; consuming: 3
2022-07-28T15:58:14.213Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:100    [limiter] freeing peer token; peer 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj; addr: /ip4/135.181.181.239/tcp/9000; active for peer: 1; waiting on peer limit: 0
2022-07-28T15:58:14.213Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/swarm_dial.go:266 network for 16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD finished dialing 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj
2022-07-28T15:58:14.213Z    DEBUG   swarm2  com_github_libp2p_go_libp2p_swarm/limiter.go:201    [limiter] clearing all peer dials: 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj
time="2022-07-28 15:58:14" level=trace msg="Could not connect with peer {16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj: [/ip4/135.181.181.239/tcp/9000]}" error="failed to dial 16Uiu2HAmPitcpwsGZf1vGiu6hdwZHsVLyFzVZeNqaSmUaSyM7Xvj:
  * [/ip4/135.181.181.239/tcp/9000] dial tcp4 135.181.181.239:9000: connect: connection refused" prefix=p2p
TRACE[07-28|15:58:14.703] FINDNODE failed                          id=2066746905d616b2 failcount=1 dropped=false err="RPC timeout"
TRACE[07-28|15:58:14.703] >> FINDNODE/v5                           id=2066746905d616b2 addr=18.223.219.100:7000
TRACE[07-28|15:58:14.703] FINDNODE failed                          id=e7c268588f3983b9 failcount=1 dropped=false err="RPC timeout"
TRACE[07-28|15:58:14.703] >> FINDNODE/v5                           id=e7c268588f3983b9 addr=135.181.182.51:9000
TRACE[07-28|15:58:14.703] >> FINDNODE/v5                           id=89c01940262d81f2 addr=52.77.252.254:9000
TRACE[07-28|15:58:14.703] FINDNODE failed                          id=89c01940262d81f2 failcount=1 dropped=false err="RPC timeout"
TRACE[07-28|15:58:14.703] >> FINDNODE/v5                           id=b24bac2e191b48e5 addr=135.181.181.239:9000
TRACE[07-28|15:58:14.703] FINDNODE failed                          id=b24bac2e191b48e5 failcount=1 dropped=false err="RPC timeout"
TRACE[07-28|15:58:15.404] FINDNODE failed                          id=b24bac2e191b48e5 failcount=1 dropped=false err="RPC timeout"
TRACE[07-28|15:58:15.404] FINDNODE failed                          id=89c01940262d81f2 failcount=1 dropped=false err="RPC timeout"
TRACE[07-28|15:58:15.404] FINDNODE failed                          id=e7c268588f3983b9 failcount=1 dropped=false err="RPC timeout"
TRACE[07-28|15:58:15.404] FINDNODE failed                          id=2066746905d616b2 failcount=1 dropped=false err="RPC timeout"
time="2022-07-28 15:58:15" level=trace msg="Fork ENR mismatches between peer and local node" error="fork digest of peer with ENR -LK4QLINdtobGquK7jukLDAKmsrH2ZuHM4k0TklY5jDTD4ZgfxR9weZmo5Jwu81hlKu3qPAvk24xHGBDjYs4o8f1gZ0Bh2F0dG5ldHOIAAAAAAAAAACEZXRoMpB53wQoAAAQIP__________gmlkgnY0gmlwhDRN_P6Jc2VjcDI1NmsxoQJuNujTgsJUHUgVZML3pzrtgNtYg7rQ4K1tkWERgl0DdoN0Y3CCIyiDdWRwgiMo: [121 223 4 40], does not match local value: [130 244 167 43]" prefix=p2p
time="2022-07-28 15:58:15" level=trace msg="Fork ENR mismatches between peer and local node" error="fork digest of peer with ENR -LK4QH1xnjotgXwg25IDPjrqRGFnH1ScgNHA3dv1Z8xHCp4uP3N3Jjl_aYv_WIxQRdwZvSukzbwspXZ7JjpldyeVDzMCh2F0dG5ldHOIAAAAAAAAAACEZXRoMpB53wQoAAAQIP__________gmlkgnY0gmlwhIe1te-Jc2VjcDI1NmsxoQOkcGXqbCJYbcClZ3z5f6NWhX_1YPFRYRRWQpJjwSHpVIN0Y3CCIyiDdWRwgiMo: [121 223 4 40], does not match local value: [130 244 167 43]" prefix=p2p
TRACE[07-28|15:58:15.404] >> FINDNODE/v5                           id=b24bac2e191b48e5 addr=135.181.181.239:9000
time="2022-07-28 15:58:15" level=trace msg="Fork ENR mismatches between peer and local node" error="fork digest of peer with ENR -KG4QCIzJZTY_fs_2vqWEatJL9RrtnPwDCv-jRBuO5FQ2qBrfJubWOWazri6s9HsyZdu-fRUfEzkebhf1nvO42_FVzwDhGV0aDKQed8EKAAAECD__________4JpZIJ2NIJpcISHtbYziXNlY3AyNTZrMaED4m9AqVs6F32rSCGsjtYcsyfQE2K8nDiGmocUY_iq-TSDdGNwgiMog3VkcIIjKA==: [121 223 4 40], does not match local value: [130 244 167 43]" prefix=p2p
TRACE[07-28|15:58:15.404] >> FINDNODE/v5                           id=2066746905d616b2 addr=18.223.219.100:7000
TRACE[07-28|15:58:15.404] >> FINDNODE/v5                           id=e7c268588f3983b9 addr=135.181.182.51:9000
TRACE[07-28|15:58:15.404] >> FINDNODE/v5                           id=89c01940262d81f2 addr=52.77.252.254:9000
TRACE[07-28|15:58:16.105] FINDNODE failed                          id=89c01940262d81f2 failcount=2 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.105] FINDNODE failed                          id=e7c268588f3983b9 failcount=2 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.105] >> FINDNODE/v5                           id=e7c268588f3983b9 addr=135.181.182.51:9000
TRACE[07-28|15:58:16.105] >> FINDNODE/v5                           id=b24bac2e191b48e5 addr=135.181.181.239:9000
TRACE[07-28|15:58:16.105] FINDNODE failed                          id=2066746905d616b2 failcount=2 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.105] >> FINDNODE/v5                           id=2066746905d616b2 addr=18.223.219.100:7000
TRACE[07-28|15:58:16.105] >> FINDNODE/v5                           id=89c01940262d81f2 addr=52.77.252.254:9000
TRACE[07-28|15:58:16.105] FINDNODE failed                          id=b24bac2e191b48e5 failcount=2 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.806] FINDNODE failed                          id=89c01940262d81f2 failcount=3 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.806] FINDNODE failed                          id=2066746905d616b2 failcount=2 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.806] FINDNODE failed                          id=b24bac2e191b48e5 failcount=2 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.806] >> FINDNODE/v5                           id=b24bac2e191b48e5 addr=135.181.181.239:9000
TRACE[07-28|15:58:16.806] >> FINDNODE/v5                           id=89c01940262d81f2 addr=52.77.252.254:9000
TRACE[07-28|15:58:16.806] FINDNODE failed                          id=e7c268588f3983b9 failcount=2 dropped=false err="RPC timeout"
TRACE[07-28|15:58:16.806] >> FINDNODE/v5                           id=e7c268588f3983b9 addr=135.181.182.51:9000

Here are the responses from a couple prysm endpoints

/ # curl http://localhost:3500/eth/v1/beacon/genesis
{"data":{"genesis_time":"1616508000","genesis_validators_root":"0x043db0d9a83813551ee2f33450d23797757d430911a9320530ad8a0eabc43efb","genesis_fork_version":"0x00001020"}}/ # 
/ # 
/ # curl http://localhost:3500/eth/v1/node/peers
{"data":[]}/ # 
/ # 
/ # curl http://localhost:3500/eth/v1/config/spec
{"data":{"ALTAIR_FORK_EPOCH":"36660","ALTAIR_FORK_VERSION":"0x01001020","BASE_REWARD_FACTOR":"64","BELLATRIX_FORK_EPOCH":"18446744073709551615","BELLATRIX_FORK_VERSION":"0x02001020","BLS_WITHDRAWAL_PREFIX":"0x00","CHURN_LIMIT_QUOTIENT":"65536","CONFIG_NAME":"prater","DEPOSIT_CHAIN_ID":"5","DEPOSIT_CONTRACT_ADDRESS":"0xff50ed3d0ec03aC01D4C79aAd74928BFF48a7b2b","DEPOSIT_NETWORK_ID":"5","DOMAIN_AGGREGATE_AND_PROOF":"0x06000000","DOMAIN_APPLICATION_MASK":"0x00000001","DOMAIN_BEACON_ATTESTER":"0x01000000","DOMAIN_BEACON_PROPOSER":"0x00000000","DOMAIN_CONTRIBUTION_AND_PROOF":"0x09000000","DOMAIN_DEPOSIT":"0x03000000","DOMAIN_RANDAO":"0x02000000","DOMAIN_SELECTION_PROOF":"0x05000000","DOMAIN_SYNC_COMMITTEE":"0x07000000","DOMAIN_SYNC_COMMITTEE_SELECTION_PROOF":"0x08000000","DOMAIN_VOLUNTARY_EXIT":"0x04000000","EFFECTIVE_BALANCE_INCREMENT":"1000000000","EJECTION_BALANCE":"16000000000","EPOCHS_PER_ETH1_VOTING_PERIOD":"64","EPOCHS_PER_HISTORICAL_VECTOR":"65536","EPOCHS_PER_RANDOM_SUBNET_SUBSCRIPTION":"256","EPOCHS_PER_SLASHINGS_VECTOR":"8192","EPOCHS_PER_SYNC_COMMITTEE_PERIOD":"256","ETH1_FOLLOW_DISTANCE":"2048","GENESIS_DELAY":"1919188","GENESIS_FORK_VERSION":"0x00001020","HISTORICAL_ROOTS_LIMIT":"16777216","HYSTERESIS_DOWNWARD_MULTIPLIER":"1","HYSTERESIS_QUOTIENT":"4","HYSTERESIS_UPWARD_MULTIPLIER":"5","INACTIVITY_PENALTY_QUOTIENT":"67108864","INACTIVITY_PENALTY_QUOTIENT_ALTAIR":"50331648","INACTIVITY_PENALTY_QUOTIENT_BELLATRIX":"16777216","INACTIVITY_SCORE_BIAS":"4","INACTIVITY_SCORE_RECOVERY_RATE":"16","INTERVALS_PER_SLOT":"3","MAX_ATTESTATIONS":"128","MAX_ATTESTER_SLASHINGS":"2","MAX_COMMITTEES_PER_SLOT":"64","MAX_DEPOSITS":"16","MAX_EFFECTIVE_BALANCE":"32000000000","MAX_PROPOSER_SLASHINGS":"16","MAX_SEED_LOOKAHEAD":"4","MAX_VALIDATORS_PER_COMMITTEE":"2048","MAX_VOLUNTARY_EXITS":"16","MIN_ATTESTATION_INCLUSION_DELAY":"1","MIN_DEPOSIT_AMOUNT":"1000000000","MIN_EPOCHS_TO_INACTIVITY_PENALTY":"4","MIN_GENESIS_ACTIVE_VALIDATOR_COUNT":"16384","MIN_GENESIS_TIME":"1614588812","MIN_PER_EPOCH_CHURN_LIMIT":"4","MIN_SEED_LOOKAHEAD":"1","MIN_SLASHING_PENALTY_QUOTIENT":"128","MIN_SLASHING_PENALTY_QUOTIENT_ALTAIR":"64","MIN_SLASHING_PENALTY_QUOTIENT_BELLATRIX":"32","MIN_SYNC_COMMITTEE_PARTICIPANTS":"1","MIN_VALIDATOR_WITHDRAWABILITY_DELAY":"256","PRESET_BASE":"mainnet","PROPORTIONAL_SLASHING_MULTIPLIER":"1","PROPORTIONAL_SLASHING_MULTIPLIER_ALTAIR":"2","PROPORTIONAL_SLASHING_MULTIPLIER_BELLATRIX":"3","PROPOSER_REWARD_QUOTIENT":"8","PROPOSER_SCORE_BOOST":"40","PROPOSER_WEIGHT":"8","RANDOM_SUBNETS_PER_VALIDATOR":"1","SAFE_SLOTS_TO_IMPORT_OPTIMISTICALLY":"0","SAFE_SLOTS_TO_UPDATE_JUSTIFIED":"8","SECONDS_PER_ETH1_BLOCK":"14","SECONDS_PER_SLOT":"12","SHARDING_FORK_EPOCH":"18446744073709551615","SHARDING_FORK_VERSION":"0x03001020","SHARD_COMMITTEE_PERIOD":"256","SHUFFLE_ROUND_COUNT":"90","SLOTS_PER_EPOCH":"32","SLOTS_PER_HISTORICAL_ROOT":"8192","SYNC_COMMITTEE_SIZE":"512","SYNC_COMMITTEE_SUBNET_COUNT":"4","SYNC_REWARD_WEIGHT":"2","TARGET_AGGREGATORS_PER_COMMITTEE":"16","TARGET_AGGREGATORS_PER_SYNC_SUBCOMMITTEE":"16","TARGET_COMMITTEE_SIZE":"128","TERMINAL_BLOCK_HASH":"0x0000000000000000000000000000000000000000000000000000000000000000","TERMINAL_BLOCK_HASH_ACTIVATION_EPOCH":"18446744073709551615","TERMINAL_TOTAL_DIFFICULTY":"4294967296","TIMELY_HEAD_FLAG_INDEX":"0x02","TIMELY_HEAD_WEIGHT":"14","TIMELY_SOURCE_FLAG_INDEX":"0x00","TIMELY_SOURCE_WEIGHT":"14","TIMELY_TARGET_FLAG_INDEX":"0x01","TIMELY_TARGET_WEIGHT":"26","VALIDATOR_REGISTRY_LIMIT":"1099511627776","WEIGHT_DENOMINATOR":"64","WHISTLEBLOWER_REWARD_QUOTIENT":"512"}}/ # 
/ # 
/ # curl http://localhost:3500/eth/v1/beacon/headers
{"data":[{"root":"0x8c0ebce425ca04612f8a4c9b3d9b339121a62a8fe2baf8ff2c6f77b81194ee87","canonical":false,"header":{"message":{"slot":"0","proposer_index":"0","parent_root":"0x0000000000000000000000000000000000000000000000000000000000000000","state_root":"0x895390e92edc03df7096e9f51e51896e8dbe6e7e838180dadbfd869fdd77a659","body_root":"0xccb62460692be0ec813b56be97f68a82cf57abc102e27bf49ebf4190ff22eedd"},"signature":"0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"}}],"execution_optimistic":false}/ # 
/ # 
/ # curl http://localhost:3500/eth/v1/beacon/states/genesis/root
{"data":{"root":"0x895390e92edc03df7096e9f51e51896e8dbe6e7e838180dadbfd869fdd77a659"},"execution_optimistic":false}/ # 
/ # 
/ # curl http://localhost:3500/eth/v1/node/identity
{"data":{"peer_id":"16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD","enr":"enr:-MK4QIrlzfObBbQtK2PjVAxtkhN7UO966scKLRr-m9y1NLMEFholHA2rbj_y2IFLMj3vx8PwH6oaw_UulcNffY7WfXaGAYJFh6Xxh2F0dG5ldHOIAAAAAAAAAACEZXRoMpCC9KcrAgAQIP__________gmlkgnY0gmlwhArAFYWJc2VjcDI1NmsxoQIRL3llifwtm_kT8s9KN1yDSRX_9pJj7-k2yuMaJQ4mXohzeW5jbmV0cwCDdGNwgjLIg3VkcIIu4A==","p2p_addresses":["/ip4/10.192.21.133/tcp/13000/p2p/16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD"],"discovery_addresses":["/ip4/10.192.21.133/udp/12000/p2p/16Uiu2HAkvaktzjwag3aqF9SxQ7vaN5aptRcmqMMLnhDYwcRV5rLD"],"metadata":{"seq_number":"1","attnets":"0x0000000000000000"}}}/ # 
/ # 
/ # curl http://localhost:3500/eth/v1/node/version
{"data":{"version":"Prysm/v2.1.3 (linux amd64)"}}/ # 
/ # 
/ # 
nisdas commented 2 years ago

Assuming you are using the correct genesis file as mentioned above, is port 12000 open ? We use that for discovery via UDP

JustinAvaLabs commented 2 years ago

Is that required to gather outbound peers? We are using the same config in all of our clusters, which contain secure components. Our other clusters were able to acquire an acceptable amount of outbound peers without opening inbound ports

nisdas commented 2 years ago

It isnt required, but it does appear something is blocking outbound udp packets in your cluster. So I am not too sure:

οΏ½[34mTRACEοΏ½[0m[07-28|15:58:16.105] FINDNODE failed                          οΏ½[34midοΏ½[0m=89c01940262d81f2 οΏ½[34mfailcountοΏ½[0m=2 οΏ½[34mdroppedοΏ½[0m=false οΏ½[34merrοΏ½[0m="RPC timeout"
JustinAvaLabs commented 2 years ago

@nisdas I was able to confirm that outbound UDP works fine. There is nothing blocking and test packets went through.
Also, all other outbound traffic is working fine.

Any other suggestions?

JustinAvaLabs commented 2 years ago

If I manually add a long list of peers, I am able to retain a couple and start processing blocks.

time="2022-07-29 18:55:13" level=info msg="Raised fd limit to 1048576 from 1048576"
time="2022-07-29 18:55:13" level=warning msg="Running on the Prater Testnet" prefix=flags
time="2022-07-29 18:55:13" level=info msg="Checking DB" database-path="/root/.eth2/beaconchaindata" prefix=node
time="2022-07-29 18:55:13" level=info msg="Opening Bolt DB at /root/.eth2/beaconchaindata/beaconchain.db" prefix=db
time="2022-07-29 18:55:16" level=info msg="Deposit contract: 0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b" prefix=node
time="2022-07-29 18:55:17" level=info msg="Running node with peer id of 16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 " prefix=p2p
time="2022-07-29 18:55:17" level=info msg="--weak-subjectivity-checkpoint not provided. Prysm recommends providing a weak subjectivity checkpointfor nodes synced from genesis, or manual verification of block and state roots for checkpoint sync nodes." prefix=blockchain
time="2022-07-29 18:55:17" level=info msg="Waiting for state to be initialized" prefix=initial-sync
time="2022-07-29 18:55:17" level=info msg="gRPC server listening on port" address="0.0.0.0:4000" prefix=rpc
time="2022-07-29 18:55:17" level=warning msg="You are using an insecure gRPC server. If you are running your beacon node and validator on the same machines, you can ignore this message. If you want to know how to enable secure connections, see: https://docs.prylabs.network/docs/prysm-usage/secure-grpc" prefix=rpc
time="2022-07-29 18:55:17" level=info msg="Starting beacon node" prefix=node version="Prysm/v2.1.3/4de92bafc4bb0051381fba48eb36830aa181d7ee. Built at: 2022-07-06 19:14:48+00:00"
time="2022-07-29 18:55:17" level=info msg="Blockchain data already exists in DB, initializing..." prefix=blockchain
time="2022-07-29 18:55:17" level=info msg="Cleaning up dirty states" count=1 prefix=db
time="2022-07-29 18:55:17" level=info msg="Starting API middleware" prefix=gateway
time="2022-07-29 18:55:17" level=info msg="Starting gRPC gateway" address="127.0.0.1:3500" prefix=gateway
time="2022-07-29 18:55:18" level=info msg="Starting initial chain sync..." prefix=initial-sync
time="2022-07-29 18:55:18" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0
time="2022-07-29 18:55:18" level=info msg="Started discovery v5" ENR="enr:-MK4QOqm_kb4ntJ0NW4wx_ZLkX0KROh5UsuOlWmviBIePh73ZBY8VPP9k9jLjhgKzucKp8zfj8i04cfoXFjFFO5F-HCGAYJLUCBZh2F0dG5ldHOIAAAAAAAAAACEZXRoMpCC9KcrAgAQIP__________gmlkgnY0gmlwhArACw-Jc2VjcDI1NmsxoQLoP-nWMmD9YuiEjjqy5LRt7OW7xgB9TA-5UjlybBcVgIhzeW5jbmV0cwCDdGNwgjLIg3VkcIIu4A" prefix=p2p
time="2022-07-29 18:55:18" level=info msg="Node started p2p server" multiAddr="/ip4/10.192.11.15/tcp/13000/p2p/16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3" prefix=p2p
2022-07-29T18:55:19.123Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAm3ohGqGjZVdHnqp2AVTrLkB617J4KQXJf6KF7yrL8r183 mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.127Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAmL9mivATwKZp3anWgeMQZ2fzyLsDbt8wu5cdgrJ43mA8k mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.128Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAm1shsRmqUUQEa35yRYMoyCbRSLes3iQkPMp3imv7cZYQV mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.130Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAmHzwXQCJco3feL8iRFc7hdSmC1f4GdieMZANdVNxBLan5 mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.133Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAmK8Lnm1T4t6CdySsSDTDsWYjpz65H3RZur54vzmB62zdp mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.134Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAm33BSCo5Lq6X58RrqQiT8JmKmFq2XHuURgRJVwuQUiji7 mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.135Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAmHaD5LLrDGYjW3fZ9QdpYpKYVrVA4ViG5VFzu3TRiN8AJ mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.135Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAm6JaCmYNqphGjUKXWCPLsiRYFJkryLHiU4t8nrY2UuoTk mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
2022-07-29T18:55:19.137Z    ERROR   net/identify    identify/id.go:696  16Uiu2HAmB4H6FBPWdY74kHgZwzjYuExdQ8Z5rihGqxfKAsDTXTY3 received key for remote peer 16Uiu2HAm9DMoorfHJGni7jxhH5yPcJaYfrg4YDi8vori2aFNzNnk mismatch: 16Uiu2HAm3qabKGir2djfGvUUvqsY3o2D9kgD9TMHZuBmrVpvx3uZ
time="2022-07-29 18:55:26" level=info msg="Processing block batch of size 29 starting from  0xaacde822... 3937/3551077 - estimated time remaining 679h31m43s" blocksPerSecond=1.4 peers=18 prefix=initial-sync
time="2022-07-29 18:55:32" level=info msg="Connected to new endpoint: http://localhost:8545" prefix=powchain
time="2022-07-29 18:55:47" level=info msg="Processing block batch of size 61 starting from  0x2470e8ef... 3968/3551078 - estimated time remaining 323h3m6s" blocksPerSecond=3.0 peers=7 prefix=initial-sync
time="2022-07-29 18:55:55" level=info msg="Processing block batch of size 59 starting from  0x81318a30... 4032/3551079 - estimated time remaining 164h12m54s" blocksPerSecond=6.0 peers=7 prefix=initial-sync
time="2022-07-29 18:56:01" level=info msg="Processing block batch of size 61 starting from  0x6c2cdfec... 4097/3551080 - estimated time remaining 108h52m11s" blocksPerSecond=9.1 peers=7 prefix=initial-sync
time="2022-07-29 18:56:08" level=info msg="Processing block batch of size 62 starting from  0x918de32c... 4160/3551080 - estimated time remaining 108h16m11s" blocksPerSecond=9.1 peers=6 prefix=initial-sync
time="2022-07-29 18:56:14" level=info msg="Saved state in DB" prefix=state-gen root=b02bcc109576 slot=4095
time="2022-07-29 18:56:14" level=info msg="Processing block batch of size 59 starting from  0x75427693... 4224/3551081 - estimated time remaining 81h45m44s" blocksPerSecond=12.1 peers=6 prefix=initial-sync
time="2022-07-29 18:56:18" level=info msg="Peer summary" activePeers=6 inbound=0 outbound=6 prefix=p2p
time="2022-07-29 18:56:21" level=info msg="Processing block batch of size 62 starting from  0x41e3b3d1... 4288/3551081 - estimated time remaining 107h40m27s" blocksPerSecond=9.2 peers=6 prefix=initial-sync
time="2022-07-29 18:56:27" level=info msg="Processing block batch of size 62 starting from  0xfde998c9... 4352/3551082 - estimated time remaining 80h25m28s" blocksPerSecond=12.2 peers=6 prefix=initial-sync
time="2022-07-29 18:56:33" level=info msg="Processing block batch of size 61 starting from  0x6e7b67c5... 4416/3551082 - estimated time remaining 80h45m10s" blocksPerSecond=12.2 peers=6 prefix=initial-sync
time="2022-07-29 18:56:39" level=info msg="Processing block batch of size 60 starting from  0x570db0e6... 4481/3551083 - estimated time remaining 80h25m18s" blocksPerSecond=12.2 peers=6 prefix=initial-sync
time="2022-07-29 18:56:46" level=info msg="Processing block batch of size 63 starting from  0x8ef18f44... 4544/3551083 - estimated time remaining 80h5m36s" blocksPerSecond=12.3 peers=6 prefix=initial-sync
time="2022-07-29 18:56:52" level=info msg="Processing block batch of size 61 starting from  0x0a33c167... 4608/3551084 - estimated time remaining 80h25m8s" blocksPerSecond=12.2 peers=6 prefix=initial-sync
time="2022-07-29 18:56:58" level=info msg="Processing block batch of size 63 starting from  0x3bdf6e3b... 4672/3551084 - estimated time remaining 79h45m58s" blocksPerSecond=12.3 peers=6 prefix=initial-sync
time="2022-07-29 18:57:04" level=info msg="Processing block batch of size 61 starting from  0x2ab80240... 4737/3551085 - estimated time remaining 79h26m35s" blocksPerSecond=12.4 peers=6 prefix=initial-sync
time="2022-07-29 18:57:10" level=info msg="Processing block batch of size 61 starting from  0x054ed6c5... 4800/3551085 - estimated time remaining 80h5m15s" blocksPerSecond=12.3 peers=6 prefix=initial-sync
time="2022-07-29 18:57:16" level=info msg="Processing block batch of size 60 starting from  0x0947ecd1... 4864/3551086 - estimated time remaining 80h24m47s" blocksPerSecond=12.2 peers=6 prefix=initial-sync
time="2022-07-29 18:57:18" level=info msg="Peer summary" activePeers=6 inbound=0 outbound=6 prefix=p2p
time="2022-07-29 18:57:22" level=info msg="Processing block batch of size 62 starting from  0x254e0f47... 4928/3551086 - estimated time remaining 80h44m28s" blocksPerSecond=12.2 peers=6 prefix=initial-sync
time="2022-07-29 18:57:28" level=info msg="Processing block batch of size 63 starting from  0xc38e8a60... 4992/3551087 - estimated time remaining 80h5m0s" blocksPerSecond=12.3 peers=6 prefix=initial-sync
time="2022-07-29 18:57:35" level=info msg="Processing block batch of size 59 starting from  0x3ab064db... 5056/3551087 - estimated time remaining 80h44m18s" blocksPerSecond=12.2 peers=6 prefix=initial-sync
JustinAvaLabs commented 2 years ago

I worry that these peers will drop over time and the node will not be able to recover. Any ideas on what might be causing the peer discovery to fail in this way?

JustinAvaLabs commented 2 years ago

As suspected, the peers dropped off one by one and eventually the node stopped bootstrapping because of a lack of suitable peers.

nisdas commented 2 years ago

The same helm chart/config works fine in 3 of my other clusters, but constantly fails in one.

I am not too sure, the fact that your 3 other clusters work might point to some specific issues with firewalls in the current cluster. How did you test the UDP networking ? Were you able to make a roundtrip connection with a remote peer ? Ex:

cluster --> remote peer ( successfully sent outbound packet) 
cluster <-- remote peer ( successfully received response packet)

The error logs do indicate that it isn't able to get a response back from the bootnodes.

00ssh commented 2 years ago

I am getting the same repeating pattern in the logs. It does not seem to surpass 2 peers in the last 2 hours. I'm trying Goerli.


time="2022-09-06 09:50:28" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0
time="2022-09-06 09:50:30" level=warning msg="Unable to cache headers for execution client votes" error="(1662375902 > 1660182298): provided time is later than the current eth1 head" prefix=powchain
time="2022-09-06 09:50:33" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0
time="2022-09-06 09:50:38" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0
time="2022-09-06 09:50:43" level=info msg="Waiting for enough suitable peers before syncing" prefix=initial-sync required=3 suitable=0```
web3d3v commented 1 year ago

I'm on arch linux, latest prysm, latest geth.

Managed to sync up to latest about a week ago, afterwards gracefully shutdown. Since then for a week:

Waiting for enough suitable peers before syncing required=3 suitable=0

Increasing --p2p-max-peers did not solve the issue. Deleting .eth2 and trying to start over did not solve the issue either.

nisdas commented 1 year ago

For those stuck on this, either you are running for goerli incorrectly. The correct way is detailed here: https://docs.prylabs.network/docs/install/install-with-bazel#run-a-beacon-node

If not, it would most likely point to an issue with your firewall. If you have verified the above both are fine and you are still having issues, feel free to ping this issue and I will reopen it. If not, I am closing the issue for now.