libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub

About the possibility to skip the HelloPacket #460

Open gfanton opened 2 years ago

gfanton commented 2 years ago

At Berty we implement our conversation groups based on go-orbit-db stores (go-orbit-db is a golang port of js-orbit-db) which themselves rely on libp2p pubsub.

Since Berty users will need to exchange messages asynchronously (send a message to a receiver who's offline), we want to provide some highly available nodes. Their role would be to replicate the messages of a group to which they have been added (NB: they won't be able to decipher the messages) and forward messages to members of the conversation who were offline at the time of sending.

To do this, we would need some kind of supernode capable of subscribing to a very large number of pubsub topics. But we are facing a problem: due to the gossip implementation, each peer on the pubsub exchanges the list of topics it is subscribed to (HelloPacket) and, as mentioned in https://github.com/libp2p/go-libp2p-pubsub/issues/402, this silently fails if too many topics have been joined and the list exceeds the maximum message size. The supernodes we are planning to set up will rapidly reach this limit.

Do you think it would be possible / desirable to have a supernode mode for the pubsub that would not require exchanging the list of topics?

Do you have any suggestions on this?
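To give a rough idea of when the limit is hit, here is a back-of-the-envelope sketch in Go. It assumes the hello is a single protobuf RPC carrying one SubOpts entry (a bool plus the topic string) per topic, and that the maximum message size is the 1 MiB default; both are assumptions on my side rather than values taken from this thread. The helper names and the topic naming scheme are hypothetical.

```go
package main

import "fmt"

// assumedMaxMessageSize is an assumption: the 1 MiB default message size limit.
const assumedMaxMessageSize = 1 << 20

// uvarintLen returns how many bytes a protobuf varint needs to encode n.
func uvarintLen(n uint64) int {
	l := 1
	for n >= 0x80 {
		n >>= 7
		l++
	}
	return l
}

// estimateHelloSize approximates the encoded size of an RPC that carries
// one subscription entry per topic. It ignores the outer length prefix.
func estimateHelloSize(topics []string) int {
	total := 0
	for _, t := range topics {
		// Entry body: bool field (tag + value) and string field (tag + length + bytes).
		body := 2 + 1 + uvarintLen(uint64(len(t))) + len(t)
		// Each entry is itself a length-delimited field of the enclosing RPC.
		total += 1 + uvarintLen(uint64(body)) + body
	}
	return total
}

func main() {
	// Hypothetical workload: 20000 topics with 65-byte names.
	topics := make([]string, 20000)
	for i := range topics {
		topics[i] = fmt.Sprintf("/orbitdb/%056d", i)
	}
	size := estimateHelloSize(topics)
	fmt.Printf("estimated hello size: %d bytes, fits in limit: %v\n",
		size, size <= assumedMaxMessageSize)
}
```

With names of that length, each topic costs roughly 70 bytes on the wire, so a node that has joined a few tens of thousands of topics would already be past the assumed limit.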

vyzo commented 2 years ago

It is not that simple: without the hello packet, the other nodes will not see the supernodes as part of their topics of interest.

A possible approach is to have these supernodes send a hello packet in response to one from a lighter node and only include topics of common interest.

This could work until you connect two supernodes together, where it would break badly.

gfanton commented 2 years ago

A possible approach is to have these supernodes send a hello packet in response to one from a lighter node and only include topics of common interest.

Sounds good to me, but it raises some questions:

Maybe other suggestions / remarks will come to mind if I describe the complete flow to you (knowing that a supernode and a replication node are the same thing):

vyzo commented 2 years ago

OK, maybe "badly" was too strong; basically, supernodes won't talk or mesh with each other. But given the architecture you describe, maybe it is not so bad; it might even be desirable!

vyzo commented 2 years ago

On the architecture side, what you describe is very sane and I think it could work well.

There shouldn't be an issue with supernodes having many topics and yes, it would make sense to use larger meshes and different gossip factors for them. You may also want to enable PX (peer exchange) in them.

vyzo commented 2 years ago

Note that I will accept a PR implementing an option (WithLazyHello might be a good name) that enacts the lazy hello policy.

The basic outline would be to not emit the eager hello and to respond to every subscription with the corresponding subscription if it is a topic of interest.
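For anyone picking this up, the policy outlined above could be modelled roughly as follows. This is only a standalone sketch of the decision logic, not code from go-libp2p-pubsub: the `node` type, its fields and its methods are hypothetical, and a real implementation would have to hook into the router's connection and subscription handling.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical stand-in for a pubsub peer, not a library type.
type node struct {
	topics map[string]bool // topics this node has joined
	lazy   bool            // true for a supernode using the lazy hello policy
}

// hello returns the subscriptions announced eagerly when a connection opens.
// A lazy node announces nothing at this point.
func (n *node) hello() []string {
	if n.lazy {
		return nil
	}
	out := make([]string, 0, len(n.topics))
	for t := range n.topics {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

// onSubscriptions returns what a node sends back after a remote peer
// announced the given topics. A lazy node answers only with the topics it
// has joined itself; an eager node has nothing to add.
func (n *node) onSubscriptions(announced []string) []string {
	if !n.lazy {
		return nil
	}
	var reply []string
	for _, t := range announced {
		if n.topics[t] {
			reply = append(reply, t)
		}
	}
	sort.Strings(reply)
	return reply
}

func main() {
	super := &node{topics: map[string]bool{"a": true, "b": true, "c": true}, lazy: true}
	light := &node{topics: map[string]bool{"b": true, "z": true}}

	// The light node announces its topics; the supernode answers with the overlap.
	fmt.Println("supernode reply to light node:", super.onSubscriptions(light.hello()))

	// Two lazy supernodes never announce anything to each other.
	other := &node{topics: map[string]bool{"a": true}, lazy: true}
	fmt.Println("supernode reply to supernode:", other.onSubscriptions(super.hello()))
}
```

The second case in `main` is the supernode-to-supernode situation mentioned earlier in the thread: since neither side sends an eager hello, neither ever learns the other's topics, so they would not mesh with each other.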

BigLep commented 2 years ago

@gfanton : do you think you'll be able to make this contribution?

2022-01-07 discussion: this is in the same theme as "how do I handle a large number of topics" (discussed more in https://github.com/libp2p/go-libp2p-pubsub/issues/402 ).

gfanton commented 2 years ago

We definitely need this feature for scalability reasons. Unfortunately, we have other, higher priority tasks to deal with at the moment. I can take care of it, but it will not be in the immediate future.