libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub
Other
328 stars 186 forks source link

Improve handling of disconnected peers in PubSub #506

Open Wondertan opened 1 year ago

Wondertan commented 1 year ago

Context

Right now, the implementation handles dead peers smartly but uncommonly. PubSub determines a dead peer if the write stream returns EOF or other error. This is possible due to the dual-stream model, where peers establish long-lived outbound streams to each other for writes and receive inbound streams for reads.

Problems

Improvement Paths

Other ideas?

vyzo commented 1 year ago

The advantage of the current system is that it is very responsive and it also works at the stream level.

Migrating to use the event bus (which appeared in libp2p way after pubsub :) may be worthwhile, but I would suggest that we improve the current peer detection to remove allocation first. This should be straightforward to implement, and it is not mutually exclusive with using the event bus later.

Wondertan commented 1 year ago

@vyzo, should we implement this or wait for inputs from other folks:?

vyzo commented 1 year ago

I think we can implement it as it is, it will fix the (minor) dos vecror.

Wondertan commented 1 year ago

Ok. Follow up question. Why do we need to read specifically varint or is one byte enough? Should we kill a peer or reset a stream if the unexpected message arrives?

vyzo commented 1 year ago

I think just a byte is enough. Killing the stream is reasonable i think, it may well be unusable at this point and indicative of a bug on the other side.

vyzo commented 1 year ago

There is also a backoff mechanism for dealing with dead peers, so we might as well declare the peer dead and enter it.