threefoldtech / zos

Autonomous operating system
https://threefold.io/host/
Apache License 2.0
84 stars 14 forks

Zos does not retry across the full set of tfchain nodes when one becomes unreachable. #2451

Open mahendravarmayadala93 opened 3 weeks ago

mahendravarmayadala93 commented 3 weeks ago

We're noticing some cases where nodes are failing to submit uptime reports, apparently due to the current outage of tfchain node 04.tfchain.grid.tf (which is known and is being addressed). The node logs show that the node repeatedly tries to contact the IP of that one tfchain node and does not try the others. Two examples are shown below.

Node 5471

6125380081955224890

Node 493

4972078876469603730

Cameron1986labeyrie commented 3 weeks ago

Node ID: 7195
