base-org / node

Everything required to run your own Base node
MIT License

Increase global peer coverage #222

Open TangMonk opened 6 months ago

TangMonk commented 6 months ago

It's hard to find peers:

base-geth-1  | INFO [03-25|07:56:33.173] Looking for peers                        peercount=0 tried=78  static=0
base-geth-1  | INFO [03-25|07:56:53.511] Looking for peers                        peercount=1 tried=79  static=0
base-geth-1  | INFO [03-25|07:57:03.647] Looking for peers                        peercount=0 tried=148 static=0
base-geth-1  | INFO [03-25|07:57:13.685] Looking for peers                        peercount=0 tried=114 static=0
base-geth-1  | INFO [03-25|07:57:23.717] Looking for peers                        peercount=0 tried=138 static=0
base-geth-1  | INFO [03-25|07:57:33.748] Looking for peers                        peercount=0 tried=92  static=0
base-geth-1  | INFO [03-25|07:57:43.799] Looking for peers                        peercount=0 tried=175 static=0
base-geth-1  | INFO [03-25|07:57:53.872] Looking for peers                        peercount=0 tried=139 static=0
base-geth-1  | INFO [03-25|07:58:04.108] Looking for peers                        peercount=0 tried=105 static=0
base-geth-1  | INFO [03-25|07:58:14.178] Looking for peers                        peercount=0 tried=117 static=0
base-geth-1  | INFO [03-25|07:58:24.263] Looking for peers                        peercount=0 tried=138 static=0
base-geth-1  | INFO [03-25|07:58:34.396] Looking for peers                        peercount=0 tried=115 static=0
base-geth-1  | INFO [03-25|07:58:44.417] Looking for peers                        peercount=0 tried=104 static=0
base-geth-1  | INFO [03-25|07:58:54.432] Looking for peers                        peercount=0 tried=89  static=0
base-geth-1  | INFO [03-25|07:59:04.459] Looking for peers                        peercount=0 tried=102 static=0
base-geth-1  | INFO [03-25|07:59:14.540] Looking for peers                        peercount=0 tried=145 static=0
base-geth-1  | INFO [03-25|07:59:24.541] Looking for peers                        peercount=0 tried=72  static=0

After a very long time, still 0 peers.
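To quantify how long a node stays at zero peers, log lines like the ones above can be summarized with a short script. This is a minimal sketch: the function name `summarize_peer_logs` is made up for illustration, and the regular expression assumes the `peercount=… tried=… static=…` layout shown in this log excerpt.

```python
import re

# Matches geth "Looking for peers" lines in the layout shown in the excerpt above.
PEER_LINE = re.compile(
    r"Looking for peers\s+peercount=(\d+)\s+tried=(\d+)\s+static=(\d+)"
)

def summarize_peer_logs(lines):
    """Return (samples, max_peercount, zero_peer_samples) for matching lines."""
    counts = [int(m.group(1)) for line in lines if (m := PEER_LINE.search(line))]
    return len(counts), max(counts, default=0), sum(1 for c in counts if c == 0)

sample = [
    "base-geth-1  | INFO [03-25|07:56:33.173] Looking for peers                        peercount=0 tried=78  static=0",
    "base-geth-1  | INFO [03-25|07:56:53.511] Looking for peers                        peercount=1 tried=79  static=0",
    "base-geth-1  | INFO [03-25|07:57:03.647] Looking for peers                        peercount=0 tried=148 static=0",
]
print(summarize_peer_logs(sample))  # (3, 1, 2)
```

Feeding it the output of `docker logs` for the geth container (saved to a file and read line by line) would give the same three numbers for a longer window.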

mariokami commented 6 months ago

I'm experiencing the same issue. I haven't dug into the architecture of the project (apologies) but it seems that 9222 tcp/udp are the discovery ports for p2p.

I've opened these up on the node but I'm still experiencing the same issue as @TangMonk - long periods of 0 peers.

roberto-bayardo commented 6 months ago

Thanks, will take a look. We set up some default public peers but perhaps they don't have enough capacity.

As for the question in the title of this issue, yes we have the static peers that should be provided by the bootnodes in the default config here. This is a recent update, so make sure your config has this too:

https://github.com/base-org/node/blob/18a9591d2b06ae90885d450e824c75ccd6d8582c/.env.mainnet#L3

wbnns commented 6 months ago

@TangMonk @mariokami Can you all also please make sure TCP / UDP for P2P is forwarded for 30303 and let us know the results?

TangMonk commented 6 months ago

> @TangMonk @mariokami Can you all also please make sure TCP / UDP for P2P is forwarded for 30303 and let us know the results?

$ sudo docker ps
CONTAINER ID   IMAGE                         COMMAND                  CREATED             STATUS                 PORTS                                                                                                                                                                                                                   NAMES
2c85eb19d42b   base-node                     "bash ./op-node-entr…"   4 hours ago         Up 4 hours             0.0.0.0:6060->6060/tcp, :::6060->6060/tcp, 0.0.0.0:7300->7300/tcp, :::7300->7300/tcp, 0.0.0.0:9222->9222/tcp, :::9222->9222/tcp, 0.0.0.0:9222->9222/udp, :::9222->9222/udp, 0.0.0.0:7545->8545/tcp, :::7545->8545/tcp   base-node-1
add7fe3cae97   base-geth                     "bash ./geth-entrypo…"   4 hours ago         Up About a minute      0.0.0.0:8545-8546->8545-8546/tcp, :::8545-8546->8545-8546/tcp, 0.0.0.0:30303->30303/tcp, :::30303->30303/tcp, 0.0.0.0:30303->30303/udp, :::30303->30303/udp, 0.0.0.0:7301->6060/tcp, :::7301->6060/tcp                  base-geth-1

And no firewall is enabled on my machine.

TangMonk commented 6 months ago

> Thanks, will take a look. We set up some default public peers but perhaps they don't have enough capacity.
>
> As for the question in the title of this issue, yes we have the static peers that should be provided by the bootnodes in the default config here. This is a recent update, so make sure your config has this too:
>
> https://github.com/base-org/node/blob/18a9591d2b06ae90885d450e824c75ccd6d8582c/.env.mainnet#L3

All of the static peer nodes you provide are in the USA, which is not convenient for people in other regions.

Could you please add some static peer nodes in Asia? Hong Kong, Singapore, or Japan would be fine. I live in China, and syncing the node is very slow.

wbnns commented 6 months ago

Thanks for the feedback; leaving this issue open to track the need for increasing global peer coverage to facilitate easier discovery.

mariokami commented 6 months ago

@wbnns thanks for the response - 30303 is forwarded on this box, and I'll keep you posted with updates. I wasn't thinking about 30303 because the docker-compose.yml file says:

      - 30303:30303     # P2P TCP (currently unused)
      - 30303:30303/udp # P2P UDP (currently unused)

@TangMonk the bootnodes are just for discovery (i.e., to provide addresses for other nodes via the Kademlia DHT), so their geographic location shouldn't make too much of a difference. I'd be interested if there was something to suggest otherwise.

rcastellaj commented 6 months ago

Is it 30303 (conflicting with Ethereum) or is it 30301? The static bootnodes list has 30301 for all of its ports...
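One way to check which ports a bootnode entry actually advertises is to parse its enode URL. The sketch below assumes the standard geth enode format (`enode://<node id>@<host>:<tcp port>`, with an optional `?discport=<udp port>` query); the helper name `enode_ports`, the node ID, and the IP address are placeholders, not real Base bootnodes.

```python
from urllib.parse import urlsplit, parse_qs

def enode_ports(enode_url):
    """Return (host, tcp_port, udp_discovery_port) from an enode URL."""
    parts = urlsplit(enode_url)
    if parts.scheme != "enode" or parts.port is None:
        raise ValueError(f"not a valid enode URL: {enode_url!r}")
    # An optional ?discport=N gives a UDP discovery port that differs from the TCP port.
    disc = parse_qs(parts.query).get("discport")
    udp_port = int(disc[0]) if disc else parts.port
    return parts.hostname, parts.port, udp_port

# Placeholder node ID (128 hex characters) and a documentation-range IP address.
example = "enode://" + "ab" * 64 + "@203.0.113.10:30301"
print(enode_ports(example))  # ('203.0.113.10', 30301, 30301)
```

Note that the port in a bootnode's enode URL is the port that remote node listens on; it does not have to match the port your own node listens on.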

parsdextra commented 6 months ago

Are the boot nodes still overloaded? Same issue here with no connected nodes, no firewall.

rcastellaj commented 6 months ago

Took about 1 hour for my node to find one peer, after that the whole chain sync took 1 hour as my L1 beacon and exec are local to the infrastructure.

To be fair, it kind of feels like you don't need more than 1 peer to start syncing. Would it be too hard to provide a reliable list of 10 or so static nodes?

lmcapp commented 5 months ago

My server is in Germany, and none of the provided static nodes can be pinged. It has been looking for peers for more than a day and the result is still 0.

mariokami commented 5 months ago

The boot nodes don't (shouldn't) participate in peering iirc - they just provide pointers to additional peers via Kademlia DHT lookups.

If anyone has a moment and is interested there is Nebula which can crawl DHT networks for analysis. It would be interesting to see what the properties of the Base network are like.

roberto-bayardo commented 5 months ago

We do provide a set of nodes available for peering, but apparently not enough. We will look into increasing capacity.

smowden commented 4 months ago

been running for days without being able to find a single peer

pikansj0s commented 4 months ago

> We do provide a set of nodes available for peering, but apparently not enough. We will look into increasing capacity.

So.... any new peers?

roberto-bayardo commented 3 months ago

We cranked up our snapshot peering this week, let me know if you still have issues.

MindlessSteel commented 3 months ago

I thought this was the basic goal of base eth.

rcastellaj commented 3 months ago

> We cranked up our snapshot peering this week, let me know if you still have issues.

Yes, it's still an issue. I've been trying to resync after updating to .4 for about 4 days now. It's stuck at peercount=2 but makes no progress.

roberto-bayardo commented 3 months ago

We have added one more machine with additional 500 peer handling capacity, though I am not sure if that will help, as the issue may be discovery related. Still digging into this.

One thing I found that helps with peer connectivity is to make sure inbound connections to 30303 are open, and you specify your external IP address appropriately (geth --nat=extip:[your external ip address]). Is that something your setup would allow?

rcastellaj commented 3 months ago

> We have added one more machine with additional 500 peer handling capacity, though I am not sure if that will help, as the issue may be discovery related. Still digging into this.
>
> One thing I found that helps with peer connectivity is to make sure inbound connections to 30303 are open, and you specify your external IP address appropriately (geth --nat=extip:[your external ip address]). Is that something your setup would allow?

Wait, why 30303? That's Ethereum.

roberto-bayardo commented 3 months ago

Right, op-geth is just a fork of geth, so it defaults to 30303 for p2p. You can override it with whatever port you prefer using the --port flag, however.

rcastellaj commented 3 months ago

Now I'm in a situation where my node finds 1-2 peers quickly enough but it never syncs. It just stays in "Looking for peers". Why?

rcastellaj commented 3 months ago

INFO [06-17|11:44:14.386] Looking for peers                        peercount=4 tried=95  static=0
INFO [06-17|11:44:24.467] Looking for peers                        peercount=4 tried=167 static=0
INFO [06-17|11:44:34.568] Looking for peers                        peercount=4 tried=160 static=0
INFO [06-17|11:44:44.766] Looking for peers                        peercount=4 tried=115 static=0
INFO [06-17|11:44:54.796] Looking for peers                        peercount=4 tried=156 static=0
INFO [06-17|11:45:04.845] Looking for peers                        peercount=4 tried=133 static=0

I'm here, and it's not syncing. Doesn't matter if I start with an empty data dir or with a snapshot. Using the 0.8.4 release.

roberto-bayardo commented 3 months ago

Can you try opening inbound connections to port 30303 and specifying the HOST_IP appropriately?

rcastellaj commented 3 months ago

Both are done and it's not starting the sync

roberto-bayardo commented 3 months ago

Can you confirm you can "telnet [your external ip address] 30303" (or whatever port you're using) to the node from an outside machine? I am having trouble replicating this, so I'm wondering if your port is still blocked somehow. We're still working on allowing more peers to be found even if the port is blocked. Thanks for your patience.
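Where telnet is not installed on the outside machine, the same inbound check can be approximated with a few lines of Python. This is only a sketch: `tcp_port_open` is a made-up helper name, and it tests TCP reachability only, so it says nothing about whether UDP discovery traffic on the same port gets through.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run from a machine outside your network, `tcp_port_open("[your external ip address]", 30303)` returning False suggests the port is still blocked or not forwarded.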

rcastellaj commented 3 months ago

I can confirm, yes. It's a kube cluster but the NodePort 30303 is being proxied through to the box, and the HOST_IP indeed can be telnet'd into 30303 from an external machine, and it arrives at the right pod.

Also, I don't have issues finding peers. It's just not starting the sync.

rcastellaj commented 3 months ago

I have 2 peers already within 2 minutes of restarting. Now running 0.8.3 I think. Same old.

roberto-bayardo commented 3 months ago

Would you mind sharing your enode id? (Feel free to e-mail...) I can see if I can peer my node with it and see if that gets it syncing.

rcastellaj commented 3 months ago

I also have 100+ peers connected now but it's not trying to catch up the chain. Can you advise of current configuration that might make this weird?

roberto-bayardo commented 3 months ago

Interesting, so now peering is better but there is still no sync. I assume you have already uncommented the flags here? https://github.com/base-org/node/blob/2dbf4a4eba04970c626a32f4c17cadc9c3a2178f/.env.mainnet#L41

Is your config otherwise standard?

rcastellaj commented 3 months ago

Ah, no, I didn't realize there were new envs there, and we are supplying them via a cm so didn't pick up the file automatically. Will try with that and report back. Thanks.

rcastellaj commented 3 months ago

@roberto-bayardo Yeah, even with 0.9.0, the updated env, and switching from the local L1 RPC/beacon to a Chainstack archive one, it still doesn't start syncing.

The data dir is wiped, and it says it is starting in snap sync mode and that the database is empty, but I'm up to 10 nodes now and it doesn't start any sync -- what am I missing? :(
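When a node reports peers but does not appear to sync, it can help to ask the execution client directly through the standard `net_peerCount` and `eth_syncing` JSON-RPC methods. The sketch below assumes the geth RPC is reachable at `http://localhost:8545` (the port mapping shown in the `docker ps` output earlier in this thread); the helper names are illustrative.

```python
import json
from urllib.request import Request, urlopen

def rpc_request(method, params=None, request_id=1):
    """Build the JSON-RPC 2.0 request body for a method call."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params or []}

def describe_sync(peer_count_hex, syncing_result):
    """Summarize net_peerCount and eth_syncing results as a readable string."""
    peers = int(peer_count_hex, 16)
    if syncing_result is False:
        # eth_syncing returns false both when fully synced and when sync has not started.
        return f"peers={peers} syncing=no"
    current = int(syncing_result["currentBlock"], 16)
    highest = int(syncing_result["highestBlock"], 16)
    return f"peers={peers} syncing=yes block={current}/{highest}"

def call(url, method):
    """POST one JSON-RPC call and return its 'result' field."""
    body = json.dumps(rpc_request(method)).encode()
    req = Request(url, data=body, headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)["result"]

def check_node(url="http://localhost:8545"):
    """Query a node and return a one-line peer and sync summary (URL is an assumption)."""
    return describe_sync(call(url, "net_peerCount"), call(url, "eth_syncing"))
```

Calling `check_node()` against a running node prints nothing by itself; wrap it in `print(...)` to see the summary. A result of `syncing=no` with a nonzero peer count matches the symptom described above.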

walkerlala commented 3 months ago

I am having the same issue here: long periods of 'peercount=0'. I have already changed the 'HOST_IP' for NAT (though I have a public IP) and added bootnodes.

walkerlala commented 3 months ago

Can anyone share more bootnode addresses here?

roberto-bayardo commented 3 months ago

> Can anyone share more bootnode addresses here?

https://github.com/base-org/node/blob/8e7460e58d51748bf0580e715863ccac589a547f/.env.mainnet#L45

ssnickolay commented 1 month ago

I had the same issue and solved it by deleting everything from the geth folder (except the chaindata folder) after unzipping the snapshot archive.

root@full-2tb ~/base/blockchain # ls geth
chaindata  LOCK  nodekey  nodes  transactions.rlp

I believe the root issue is the nodekey, which is the same for every user who downloaded the official snapshot. As a result, peers reject the connection because that key already exists in the peer network.
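The cleanup described above could be scripted roughly as follows. This is a sketch under assumptions: `clean_geth_dir` is a hypothetical helper, the node should be stopped first, and it defaults to a dry run so nothing is deleted until `dry_run=False` is passed.

```python
import shutil
from pathlib import Path

def clean_geth_dir(geth_dir, keep=("chaindata",), dry_run=True):
    """Remove every entry in geth_dir except those named in keep.

    Returns the sorted names that were (or, in a dry run, would be) removed.
    """
    removed = []
    for entry in sorted(Path(geth_dir).iterdir(), key=lambda p: p.name):
        if entry.name in keep:
            continue
        removed.append(entry.name)
        if not dry_run:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # e.g. the "nodes" peer database directory
            else:
                entry.unlink()        # e.g. "nodekey" and "LOCK"
    return removed
```

For the directory listing shown above, `clean_geth_dir("geth")` would report the entries other than chaindata without touching them; geth generates a fresh nodekey on the next start if none exists.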

If you have trouble downloading the state (after reaching some peers), please check the local Optimism node (op-node) logs. In my case, I forgot to change OP_NODE_L1_ETH_RPC and quickly hit the rate limits:

lvl=warn msg="Engine temporary error" err="temp: failed to find L1 block info by number, at origin 0xe125aab359435b0f224d3ab6f9603201aae967c95821bd9b715cedb8f13349b5:20625586 next 20625587: failed to fetch header by num 20625587: Exceeded the quota usage"
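Warnings like the one above are easy to miss in a busy log, so a small filter can help. The sketch below parses `key=value` and `key="quoted value"` pairs and flags warnings whose `err` field mentions a quota. The helper names are illustrative, and the "quota" wording is an assumption based on this one log line; other RPC providers may phrase rate-limit errors differently.

```python
import re

# key=bare_value or key="quoted value" pairs, as in the op-node line above.
FIELD = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')

def parse_logfmt(line):
    """Parse key=value / key="quoted value" pairs from one log line."""
    return {key: quoted or bare for key, quoted, bare in FIELD.findall(line)}

def is_l1_quota_error(line):
    """Return True for warn-level lines whose err field mentions a quota."""
    fields = parse_logfmt(line)
    return fields.get("lvl") == "warn" and "quota" in fields.get("err", "").lower()

line = (
    'lvl=warn msg="Engine temporary error" '
    'err="temp: failed to fetch header by num 20625587: Exceeded the quota usage"'
)
print(is_l1_quota_error(line))  # True
```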