probe-lab / hermes

A Gossipsub listener and tracer.
Other
10 stars 5 forks source link

Goodbye handler #25

Closed namn-grg closed 1 month ago

namn-grg commented 1 month ago

I have some questions that I would appreciate help with -

cortze commented 1 month ago

I'm unsure of understanding this correctly.

I noticed in the logs that after receiving goodbye from a peer, we keep dialling the peer again, whats the reason? Should we not kill/remove it after receiving goodbye once.

There are two scenarios that I see here, but I'm not sure which one are you refering to:

  1. Even though the remote peer sends us a Goodbye message, drops the connection, and we end up removing the peer from the pool of active connections, we can still contact back that same peer after a given time (after a Grace period to avoid spamming it). Connections are often dropped because the remote peer has "too many connections" open already, but this doesn't stop us from contacting it again. NOTE: if the remote peer sends a disconnection RPC to us (at the libp2p level), that will also close the connection and remove it from our pool of connected nodes. As far as I know, there is no need to do it manually.

  2. As in any other client written in Go, there are many go-routines simultaneously "communicating" with a remote peer, especially at the beginning of a connection when performing the handshake. This could mean that, although we receive the Goodbye message, we still have other ongoing RPCs. Thus, the received logs right after the Goodbye might just be the logs of RPCs that didn't know the remote peer sent us a Goodbye message (yet).

After we have successfully received the metadata response, what do we do with the peer? Ideally, we should remove it.

Coming from your previous example, we only close and remove the connection with a peer if the handshake goes wrong. A wrong handshake means that the remote peer closed/dropped the stream, doesn't belong to the same network, or isn't of interest.

In the other case, where the metadata is successfully received, we should actually keep it within our pool of connected nodes. As it can be beneficial to keep it ready whenever we need peers on a given mesh. The behaviour that you describe gets closer to what a network crawler would like to do, as it wouldn't like to "waste" resources on keeping connections open but trying to connect and identify more nodes in the network.

Let me know if I'm getting this correctly, I'm taking a cup of coffee as I go through this :)

namn-grg commented 1 month ago

Thanks @cortze ! This was very helpful and cleared up my confusion, thanks!