Open WietzeSlagman opened 2 months ago
Related logs for cases where clients stop syncing after some time. In this case, 2 clients stopped at the same time. A restart gets them syncing again, but after fewer than 500 blocks they stop again and need another restart.
Another case where clients have stopped syncing, with logs below:
This took place during the "900 clients connecting to the fringe clients, in a decentralized topology." test. Several clients appear to be quite far from tip, potentially because they hit "maximum peers reached" while the peers they are connected to are not themselves synced to tip, or because of some related issue. I have included logs from two of our clients that got stuck in the 260-280k block range on canary net.
Note: I only included one day's worth of logs, but I am happy to include more if helpful.
client2-canary-may15.log client-canary-may15.log
snippet from where we initially stalled:
May 13 18:01:02 client-nodes-1 snarkos[3894356]: 2024-05-13T18:01:02.508529Z TRACE snarkos_node_sync::block_sync: No block requests to send - try advancing with block responses (at block 258945)
May 13 18:01:05 client-nodes-1 snarkos[3894356]: 2024-05-13T18:01:05.392179Z DEBUG snarkos_node_router::heartbeat: Connected to 21 peers [34.71.154.32:4136, 34.16.96.117:4134, 34.171.188.136:4138, 35.196.17.175:4130, 34.133.96.97:4131, 34.30.43.173:4134, 34.134.47.103:4139, 104.155.187.104:4132, 34.121.237.207:4130, 34.28.213.8:4139, 35.202.22.233:4136, 35.184.224.202:4136, 35.192.210.26:4139, 34.27.166.56:4135, 35.223.204.80:4136, 34.28.213.8:4131, 34.134.226.163:4130, 35.226.159.181:4130, 34.67.141.212:4130, 34.133.96.97:4136, 209.97.156.21:4130]
May 13 18:01:05 client-nodes-1 snarkos[3894356]: 2024-05-13T18:01:05.392269Z INFO snarkos_node_router::heartbeat: Disconnecting from '209.97.156.21:4130' (periodic refresh of peers)
May 13 18:01:05 client-nodes-1 snarkos[3894356]: 2024-05-13T18:01:05.392317Z WARN snarkos_node_router: Dropping connection attempt to '34.74.95.84:4130' (maximum peers reached)
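To locate where a client stalled in a longer log, one option is to track the height reported in the "(at block N)" suffix of the block_sync TRACE lines and flag heights that do not change for a while. The following is a minimal sketch, not part of snarkOS: `find_stalls`, `HEIGHT_RE`, and the 10-minute `min_duration` threshold are hypothetical names and values chosen for illustration, and the pattern assumes every relevant line has the same shape as the single TRACE line in the snippet above.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape, taken from the TRACE line in the snippet:
# "<syslog prefix> 2024-05-13T18:01:02.508529Z TRACE ... (at block 258945)"
HEIGHT_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+Z .*\(at block (\d+)\)"
)


def find_stalls(lines, min_duration=timedelta(minutes=10)):
    """Return (height, first_seen, last_seen) for every height that was
    reported continuously for at least min_duration."""
    stalls = []
    height = first = last = None
    for line in lines:
        match = HEIGHT_RE.search(line)
        if not match:
            continue  # not a line that reports the current height
        ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        h = int(match.group(2))
        if h != height:
            # Height changed: close out the previous run if it was long enough.
            if height is not None and last - first >= min_duration:
                stalls.append((height, first, last))
            height, first = h, ts
        last = ts
    # Close out the final run, which is the one that matters for a live stall.
    if height is not None and last - first >= min_duration:
        stalls.append((height, first, last))
    return stalls
```

Passing the lines of one of the attached files (for example `client-canary-may15.log`) to `find_stalls` would list each height the client sat on and for how long, which makes it easier to line up stalls across several clients.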
Environment:
Running snarkOS commit: [https://github.com/AleoNet/anf-snarkOS/commit/fc340c679960e63612c536d69e71405b77e113f4]
rustc 1.77.2 (25ef9e3d8 2024-04-09)
Ubuntu 22.04.4 LTS
🐛 Bug Report
A client node running on canary occasionally stops syncing for a long while and does not catch up to tip. It does not receive any new blocks from the validator it is connected to and only sees its own height as the tip.
Steps to Reproduce
It is unclear how to reproduce this exactly. However, the following logs appear on the client when it stops receiving new blocks because it sees itself as the highest height.
A potentially related factor is that the client is refusing connections to, or disconnecting from, the connected validator and is then unable to establish that connection.
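One way to check whether connection refusals line up with the stall is to tally the router messages per peer address. The sketch below is an assumption-laden helper, not snarkOS code: `summarize_router_events` and the two pattern names are hypothetical, and the message text is copied from the "maximum peers reached" and "periodic refresh of peers" log lines quoted earlier in this thread.

```python
import re
from collections import Counter

# Message shapes copied from the snarkos_node_router lines in the log snippet.
DROPPED_RE = re.compile(
    r"Dropping connection attempt to '([^']+)' \(maximum peers reached\)"
)
REFRESH_RE = re.compile(
    r"Disconnecting from '([^']+)' \(periodic refresh of peers\)"
)


def summarize_router_events(lines):
    """Count, per peer address, how often a connection attempt was dropped
    and how often the peer was disconnected during a periodic refresh."""
    dropped, refreshed = Counter(), Counter()
    for line in lines:
        if (m := DROPPED_RE.search(line)):
            dropped[m.group(1)] += 1
        elif (m := REFRESH_RE.search(line)):
            refreshed[m.group(1)] += 1
    return dropped, refreshed
```

If the validator's address shows up repeatedly in either counter around the time syncing stopped, that would support the theory that the client cannot hold a connection to it; if it never appears, the cause is more likely elsewhere.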
Expected Behavior
Clients should keep syncing to the latest tip and stay in sync with their connected peers (validators and clients).
Your Environment