Open MichaelMure opened 3 months ago
We already have a PX (peer exchange) mechanism as part of mesh pruning; a common pattern is to set up (some) bootstrap nodes with PX enabled and the mesh degree set to 0. That way you can leverage the bootstrap nodes for peer discovery.
Regardless, I am open to adding a general PX control message, but we would have to work through this in libp2p/specs.
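For readers looking for the concrete setup, here is a minimal sketch of such a bootstrap node using go-libp2p-pubsub. It assumes a recent go-libp2p where `libp2p.New()` takes no context. Zeroing `Dscore` alongside the other degree parameters is my own assumption to keep the parameters consistent; it is not something stated in this thread.

```go
package main

import (
	"context"
	"log"

	"github.com/libp2p/go-libp2p"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
)

func main() {
	ctx := context.Background()

	// Default host; a real bootstrapper would pin its identity and listen addresses.
	h, err := libp2p.New()
	if err != nil {
		log.Fatal(err)
	}
	defer h.Close()

	// Start from the defaults and zero the mesh degree so this node does not
	// keep mesh peers and prunes instead.
	params := pubsub.DefaultGossipSubParams()
	params.D, params.Dlo, params.Dhi, params.Dout = 0, 0, 0, 0
	params.Dscore = 0 // assumption: kept consistent with D = 0

	_, err = pubsub.NewGossipSub(ctx, h,
		pubsub.WithPeerExchange(true), // emit PX records when pruning
		pubsub.WithGossipSubParams(params),
	)
	if err != nil {
		log.Fatal(err)
	}

	log.Printf("bootstrapper running as %s", h.ID())
	select {} // block forever
}
```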
> a common pattern is to set up (some) bootstrap nodes with PX enabled and the mesh degree set to 0
Could you detail that a bit more? I've tried that option and found that it had zero useful effect. I also couldn't find documentation on how I was supposed to use it.
What I saw was this: even though my program was set up for graceful termination of the libp2p components, when the peers were connected to that single bootstrap node and the node was shut down, no failover to another peer happened. The pubsub peers ended up disconnected from each other, even though they had been receiving messages from each other a moment before.
You need to configure a score above the PX threshold for your bootstrapper, otherwise the peers will ignore PX.
You can do this with an application-level score.
And don't forget to enable PX emission in the bootstrap node!
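On the regular peers, that could look roughly like the sketch below. It uses go-libp2p-pubsub's `WithPeerScore` option with an application-specific score function. The concrete numbers (a score of 100, a PX threshold of 50, the negative thresholds) are arbitrary illustration values, and passing the bootstrapper's peer ID as a command-line argument is just a convenience for the example.

```go
package main

import (
	"context"
	"log"
	"os"
	"time"

	"github.com/libp2p/go-libp2p"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
	"github.com/libp2p/go-libp2p/core/peer"
)

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: peer <bootstrap-peer-id>")
	}
	bootstrapID, err := peer.Decode(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}

	h, err := libp2p.New()
	if err != nil {
		log.Fatal(err)
	}
	defer h.Close()

	scoreParams := &pubsub.PeerScoreParams{
		// Give the bootstrapper a fixed application-level score; everyone else gets 0.
		AppSpecificScore: func(p peer.ID) float64 {
			if p == bootstrapID {
				return 100 // illustration value, must exceed AcceptPXThreshold
			}
			return 0
		},
		AppSpecificWeight: 1,
		DecayInterval:     time.Second,
		DecayToZero:       0.01,
	}
	thresholds := &pubsub.PeerScoreThresholds{
		GossipThreshold:             -10,
		PublishThreshold:            -20,
		GraylistThreshold:           -30,
		AcceptPXThreshold:           50, // PX from peers scoring below this is ignored
		OpportunisticGraftThreshold: 1,
	}

	_, err = pubsub.NewGossipSub(context.Background(), h,
		pubsub.WithPeerScore(scoreParams, thresholds),
	)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("peer running as %s", h.ID())
}
```

The peer still has to connect to the bootstrapper and join the topic as usual; only the scoring part is shown here.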
I've been playing with GossipSub recently, and I noticed that pubsub doesn't consolidate its peering by itself. To be clear, I'm not talking about general peer discovery (as in, "find peers for that topic as I have none"). I'm talking about pubsub receiving messages from peers in that topic, but not trying to connect to them directly even though the gossip topology is below the requirements.
As I understand it, the expected solution is to pair GossipSub with a DHT and the WithDiscovery() option, so that pubsub can ask for more peers when it is below the topology requirements. That is quite a heavy solution in my opinion, especially when pubsub already knows those peers, just not their multiaddrs.
If no DHT is set up, the peering is very brittle. If there is a single bootstrap node, every peer's communication will go through that bootstrap node in a star topology, with an obvious single point of failure.
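To make the failure mode concrete, here is a tiny stdlib-only model of it. It is not libp2p code and the peer names are made up: it just builds a star around a bootstrap node and checks whether the remaining peers stay connected once that node is removed.

```go
package main

import "fmt"

// graph is an undirected adjacency set keyed by peer name.
type graph map[string]map[string]bool

// link adds an undirected edge between a and b.
func (g graph) link(a, b string) {
	if g[a] == nil {
		g[a] = map[string]bool{}
	}
	if g[b] == nil {
		g[b] = map[string]bool{}
	}
	g[a][b] = true
	g[b][a] = true
}

// without returns a copy of g with node and all of its links removed.
func (g graph) without(node string) graph {
	out := graph{}
	for a, nbrs := range g {
		if a == node {
			continue
		}
		if out[a] == nil {
			out[a] = map[string]bool{} // keep isolated peers in the graph
		}
		for b := range nbrs {
			if b != node {
				out.link(a, b)
			}
		}
	}
	return out
}

// connected reports whether every peer can reach every other peer.
func (g graph) connected() bool {
	if len(g) == 0 {
		return true
	}
	var start string
	for n := range g {
		start = n
		break
	}
	seen := map[string]bool{start: true}
	queue := []string{start}
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		for m := range g[n] {
			if !seen[m] {
				seen[m] = true
				queue = append(queue, m)
			}
		}
	}
	return len(seen) == len(g)
}

// star builds a topology where every leaf is linked only to hub.
func star(hub string, leaves ...string) graph {
	g := graph{}
	for _, l := range leaves {
		g.link(hub, l)
	}
	return g
}

func main() {
	g := star("bootstrap", "a", "b", "c")
	fmt.Println(g.connected())                      // true
	fmt.Println(g.without("bootstrap").connected()) // false

	// Once the peers also link to each other, losing the hub is survivable.
	g.link("a", "b")
	g.link("b", "c")
	fmt.Println(g.without("bootstrap").connected()) // true
}
```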
What I've ended up doing is adding two extra messages to my tiny app protocol, and packaging that into a WithDiscovery():
It works really well: the topology grows and is resilient. When some peers are on the same LAN, they even find each other directly without mDNS. ... but that feels gross and really sub-optimal. Flooding the topic with answers is bad, having an external component is bad, and having to integrate that into the app protocol is bad.
Would it be possible to have an opt-in solution so that pubsub itself queries for other peers, as part of the pubsub protocol? That would be so much cleaner and more efficient.