radixdlt / olympia-node

Radix monorepo

NT-60: improve handling of failed connections #597

Closed siy closed 2 years ago

siy commented 2 years ago

Changed the approach to handling failed peer connections. The algorithm works as follows:

With every failed connection attempt, the peer is ignored for a progressively longer interval. The initial delay is set to 1 and is doubled with each subsequent failed connection. Once the calculated interval exceeds 1 hour, the peer is removed from the address book entirely.

This approach gives a failed node enough time to recover (also about 1 hour), but if the issue persists the peer is removed, so no stale records remain in the address book. This should eliminate issues with failed peers that come back online under a different IP address.
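
For illustration only, the doubling-and-eviction logic described above could be sketched roughly as follows. This is not the code from this PR: the class name `FailedPeerBackoff`, its method names, and the use of a plain string as the peer key are all hypothetical. The description does not state the unit of the initial delay, so the sketch takes it as a constructor parameter.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-peer exponential backoff with eviction. */
final class FailedPeerBackoff {
    /** Backoff state kept for a peer that has failed at least once. */
    private static final class State {
        final Duration delay;        // delay applied after the latest failure
        final Instant ignoredUntil;  // the peer is skipped until this instant

        State(Duration delay, Instant ignoredUntil) {
            this.delay = delay;
            this.ignoredUntil = ignoredUntil;
        }
    }

    private final Duration initialDelay; // assumption: unit is not given in the description
    private final Duration maxDelay;     // 1 hour in the description
    private final Map<String, State> failed = new HashMap<>();

    FailedPeerBackoff(Duration initialDelay, Duration maxDelay) {
        this.initialDelay = initialDelay;
        this.maxDelay = maxDelay;
    }

    /**
     * Records a failed connection attempt.
     *
     * @return true if the peer is kept (and ignored for the new delay),
     *         false if the calculated delay exceeded the maximum and the
     *         peer should be removed from the address book
     */
    boolean recordFailure(String peer, Instant now) {
        State previous = failed.get(peer);
        Duration delay = previous == null ? initialDelay : previous.delay.multipliedBy(2);
        if (delay.compareTo(maxDelay) > 0) {
            failed.remove(peer); // drop the backoff state together with the peer
            return false;
        }
        failed.put(peer, new State(delay, now.plus(delay)));
        return true;
    }

    /** Returns true while the peer is still inside its ignore interval. */
    boolean isIgnored(String peer, Instant now) {
        State state = failed.get(peer);
        return state != null && now.isBefore(state.ignoredUntil);
    }
}
```

Under the assumption of a 1-second initial delay, the calculated intervals would be 1 s, 2 s, 4 s and so on up to 2048 s; the next doubling gives 4096 s, which exceeds 1 hour, so the peer would be removed on that attempt. The ignore intervals before removal add up to 4095 s, a little over an hour, which is consistent with the recovery window mentioned above.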

Known issue:

sonarcloud[bot] commented 2 years ago

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

88.2% Coverage
0.0% Duplication