libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub
Other
321 stars 185 forks source link

Gossipsub backoff latencies may result in punishment #368

Closed blacktemplar closed 4 years ago

blacktemplar commented 4 years ago

Consider the following scenario:

  1. Peer A sends a PRUNE to peer B with a backoff time of 60 seconds. It stores the 60 second backoff time in its internal data structure.
  2. Peer B receives the PRUNE message after a short delay of lets say one second and stores the 60 second backoff in its internal data structure.
  3. Exactly 60 seconds after peer A sent the PRUNE it might decide to GRAFT to peer B again and sends a GRAFT.
  4. This time the message gets received and processed by peer B in less than a second and therefore for peer B the backoff is still active and it declines the GRAFT and punishes peer A.

To circumvent such punishments peer A should consider some slack time before it sends a GRAFT to peer B. So I propose adding some slack times to backoffs after sending a PRUNE. Note that this slack time should only be considered for sending GRAFTs and not for checking if an incoming GRAFT is allowed (in this case we consider the real backoff time without the additional slack).

vyzo commented 4 years ago

There is some slack time already; the backoffs are cleared every few ticks, so we won't try to send a GRAFT immediately after expiration.

blacktemplar commented 4 years ago

That is not a real slack in my opinion. In the extreme case it might happen that the next tick that clears the backoffs gets executed directly after the 60 seconds are over in which case again peer B would punish peer A.

vyzo commented 4 years ago

Fair enough, we could add an explicit slack of 1s or something before attempting the next GRAFT.

vyzo commented 4 years ago

You also need to consider the time it gets for the GRAFT to get there, which should roughly match the time it takes for the backoff to expire in the other side if we initiated the PRUNE. But RTTs can be variable, so agreed this could be problematic in the edge case.

vyzo commented 4 years ago

fix in #374.