Open raucao opened 8 months ago
From what I can see in the logs, your node struggles to discover network peers other than the bootstrap ones (a.k.a. Bootstrap-Only nodes). In testnet, the bootstrap nodes provide only one service - discovery of other peers - which is why your node wasn't able to fetch blocks and start syncing without being connected to at least one "full" node.
The reason why your node wasn't able to find (discover) full nodes in the network might be that the so-called "buckets" corresponding to your nodeId in the boot nodes' internal tables were already filled up with other peers. This is usually rare, but it does happen. You can find more info on how the discovery protocol works, for instance, here.
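To illustrate the "full bucket" situation, here is a minimal sketch of a Kademlia-style peer table. It is not the node's actual implementation: the bucket capacity of 16, the use of SHA-256 as the ID hash, and the names `RoutingTable`, `bucket_index` and `try_add` are all assumptions made for the example. A real table would typically also try to evict stale entries instead of simply rejecting a newcomer.

```python
import hashlib

BUCKET_SIZE = 16  # assumed per-bucket capacity; the real value may differ


def hashed_id(node_id: bytes) -> int:
    """Map a raw node ID to a 256-bit integer (SHA-256 is a stand-in hash)."""
    return int.from_bytes(hashlib.sha256(node_id).digest(), "big")


def bucket_index(own_id: bytes, other_id: bytes) -> int:
    """Bucket of a peer = position of the highest bit in the XOR distance."""
    distance = hashed_id(own_id) ^ hashed_id(other_id)
    return distance.bit_length() - 1  # -1 only when both IDs are identical


class RoutingTable:
    """Hypothetical, simplified peer table of a boot node."""

    def __init__(self, own_id: bytes) -> None:
        self.own_id = own_id
        self.buckets: dict[int, list[bytes]] = {}

    def try_add(self, peer_id: bytes) -> bool:
        """Return True if the peer is (or already was) stored in the table."""
        idx = bucket_index(self.own_id, peer_id)
        if idx < 0:
            return False  # never store ourselves
        bucket = self.buckets.setdefault(idx, [])
        if peer_id in bucket:
            return True
        if len(bucket) >= BUCKET_SIZE:
            return False  # bucket is full: the new peer is not recorded
        bucket.append(peer_id)
        return True
```

Because the bucket depends on the hash of the nodeId, a node with a different nodeId will generally land in a different bucket on each boot node, and that bucket may still have room.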
We are now working on improving the UX of the initial bootstrapping process in Testnet, e.g. we recently increased the number of boot nodes in this PR, but that hasn't been released yet.
What you can do for now is the following:
Generate a new nodeId. The simplest way is to remove (and back up first, if needed) the <databaseDir>/nodeId.properties file, so that your node automatically generates a new one on the next start. With this, there's a good chance that at least one boot node will have a non-full "bucket" that corresponds to your new nodeId, which would allow your node to find other full nodes in the network.
Thank you for the explanation.
I have tried all of the suggested mitigations, but without luck so far. I had already restarted the node multiple times before, because I remembered how that seemed to fix sync not starting in the past.
I do still have another testnet node running on a different machine, and also added that to the bootstrap list. I do get a couple more IP addresses that aren't the normal bootstrap nodes now, but still no sync. Is there a way that I can tell my own nodes to immediately sync with each other perhaps? They are both on the same private network, so maybe I could even add a rule for prioritized networks or something?
Yes, you can specify nodes that you trust or that you want to connect to in your config file - this should also help with bootstrapping / starting a sync. Check out the relevant config sections.
Great! Adding the new node to the existing node's trusted list, and connecting by default to it from the new node solved my issue. Thank you!
The reported original issue still exists when you do not have a trusted node to connect to, but just want to start initial sync via the normal discovery process. So I think this should be kept open until it's been confirmed to work reliably for new users.
I set up two new VMs containing one testnet and one mainnet node on a new host. The mainnet node started initial sync normally after a short while. However, the testnet node has not started syncing a single block after 2 weeks of trying.
The logs will occasionally report bootstrap peers having been found, but there are no errors or warnings reporting any issues with sync.
How can I debug why sync is not starting?