Closed ping-ke closed 4 months ago
the peer will remove or recover after some time
Why is the peer removed or recovered after some time? Could you please provide more explanation?
I am still a little concerned because we haven’t identified the root cause of addresses disappearing in the peer store. If we change this error to the debug level, we will essentially lose visibility of the issue.
Change the log level to debug for the error caused by newStream, this should be a temporary issue between the local node and the remote peer, I have seen this issue twice (one reported by a customer and one by me - the network condition at that time was very bad.)
This issue will be resolved some times later when
- When the peer is removed
- Addr is added back to Peerstore in the discovery.go
According to the discussion in today’s meeting, I think we may need to set up a test environment to determine if the root cause is an outdated TTL.
Added log to removeExpired in addr_book.go, and will see it clear the addresses after expired. and connect to this peer will throw "no addresses" issue.
Added log to removeExpired in addr_book.go, and will see it clear the addresses after expired. and connect to this peer will throw "no addresses" issue.
Then should we also need to remove the peer in syncClient when the peerstore removes the addr?
Need to update the PR title & resolve the conflicts
issue 262: https://github.com/ethstorage/es-node/issues/262
Root Cause When creating a newStream to send a request to a peer, if the connection does not exist, it will call the peer again. But if the address of that peer does not exist, this issue will be thrown. The peer address does not exist because of time-to-live timeout, which is 24 hours correctly. time-to-live timeout means the peer not be added back to peerstore by discovery.
pstore.AddAddrs(info.ID, info.Addrs, discoveredAddrTTL) // discoveredAddrTTL = time.Hour * 24
How to fix it Change the log level to debug for the error caused by newStream "no addresses" and close the peer.