pub/sub - publish / subscribe

jbenet commented 8 years ago

We've known for some time we need to layer a pub/sub system on IPFS. We should try to reuse other things. The least work, the better. But it should be a simple protocol, easy to implement, well-layered, and meshing well with the rest of IPFS abstractions.

Requirements

very, very fast
flexible (maybe different topology-forming algorithms)
multiple modalities (single publisher, multiple publishers, etc)
support both encrypted and unencrypted streams (encrypted again, this is above the regular libp2p encryption -- and specific to the pub/sub group)
support privately encrypted channels (ie user supplied keys)
layers over IPRS to do discovery

We need to:

[ ] do a survey of relevant {literature, protocols, and implementations}.
[ ] decide on a protocol
[ ] build it into libp2p.

I likely won't have time to do a proper survey until late Nov or Dec. If you'd like to speed this along, post links to great {papers, systems} here for me.

Relevant to research:

https://github.com/ipfs/ipfs/issues/73 and #ipfs IRC logs.
XMPP and matrix.org
MQTT and other messaging queues.
all the multicast research
all the pub/sub research

jachiang commented 6 years ago

@Stebalien Thanks for the answers!!! @RangerMauve thanks for the great questions I am also learning from!

@RangerMauve Sort of... I'm not sure about js-ipfs but the go-ipfs ipfs pubsub command has an optional 'discovery' option that uses the DHT. Unfortunately, this is kind of a hack. To register "interest" in a topic, one uses the provider system to provide a block with the content floodsub:$topic_name and peers looking for other peers interested in the topic search for peers providing that block.

So the topic needs to be known apriori? If a secret channel topic is not made public for example, there is no way it can be exposed?

jachiang commented 6 years ago

@Stebalien So is the floodsub:$topic_name pattern the way to go for bypassing the DHT not allowing arbitrary get/put? :P

How does OpenBazaar put profile info into DHT? Is it only on their OB network I suppose?

Stebalien commented 6 years ago

@RangerMauve

So is the floodsub:$topic_name pattern the way to go for bypassing the DHT not allowing arbitrary get/put? :P

It's not really an arbitrary get/put. It puts a record mapping CidOfBlock("floodsub:$topic_name") to the peer's ID.

Would a new pubsub implementation be welcome if it conforms to the same public interface?

You're free to do so but it may break in the future. There's a reason this is marked experimental (although we will try to avoid breaking things too badly).

@jachiang

So the topic needs to be known apriori? If a secret channel topic is not made public for example, there is no way it can be exposed?

You do need the topic to subscribe to it (there's no "list all topics" command). However, your node will tell all of its peers which topics it's interested in so they're hardly secret.

How does OpenBazaar put profile info into DHT? Is it only on their OB network I suppose?

Yes. We explicitly do not want to allow arbitrary data on the DHT (too easy to abuse). The correct way to do that is with IPNS (IPNS record points to a "profile" stored in IPFS). IPNS is currently a bit... finicky in practice.

RangerMauve commented 6 years ago

Thank you for all the details @Stebalien!

From the README, it looks like the JS version of the DHT isn't totally implemented. How does pubsub work in the browser if the DHT support is iffy?

Stebalien commented 6 years ago

As I said, that particular peer-finding mechanism is a bit of a hack. I doubt that JS uses it. @diasdavid?

paralin commented 6 years ago

Is there any progression on the pubsub front?

RangerMauve commented 6 years ago

I've been thinking about the state of pubsub, was wondering if I could get comments on the following:

I think that constructing a minimum spanning tree (MST) for groups is the way to go.

Use one of the algorithms out there for Distributed Minimum Spanning Trees
For the "distance" use the logical distance between peer IDs
- Using latency is hard because you'd need to connect to peers before knowing what your latency is
- Logical distance is what the DHT already uses for logic, so you're likely to be connected to peers with a close distance to you already if you've bootstrapped into the dht
Individual nodes that may not be subscribed to much will end up getting more traffic, but there will be less traffic overall in the network
Instead of relaying based on listeners, join "networks" with a given ID and broadcast all events through your network
Have peers publish to the DHT with which pubsub network they're on for bootstrapping the MST
Find more peers by querying the peers you're connected to
Subscribing becomes filtering what data your application will react to when events are passed through the node
- This allows fancier patterns for subscribing like MQTT wildcards
Network IDs can be hashes of public keys and packets should be automatically encrypted with the key using AES or something
- Maybe allow for signing instead of encryption so that there can be nodes that only consume and relay and can't publish
Allow for cleartext network names for where security isn't a consideration (a la current pubsub)
Maybe network ID is based on a hash of a network description object in IPLD (name, encryption type, etc)

paralin commented 6 years ago

Distributed minimum spanning tree does look like the right approach for this, I have to say.

pgte commented 6 years ago

@RangerMauve interesting!

About the "distance" function, I'm thinking that poorly connected nodes could impact the network if the latency or failure rate is not taken into account. Discovering new nodes will be done through some sort of gossip protocol, right? Idea: why not also spread the perceived "health" of nodes in a cooperative way? I think there are already some models for this, from the top of my head:

Accrual failure detection, determines a probability of a node having failed: http://fubica.lsd.ufcg.edu.br/hp/cursos/cfsc/papers/hayashibara04theaccrual.pdf
The failure detection used in the swim protocol: https://asafdav2.github.io/2017/swim-protocol/

Perhaps this could bring some type of decentralised measure of "health", which could then be used to calculate distance?

RangerMauve commented 6 years ago

@pgte The distance function is there to determine which nodes are logically "close" to each other in order to determine who should connect together to form the minimum spanning tree. The main benefit is that it's computationally cheap to determine the logical distance since all you need is the ID of the peer, a node can calculate it's distance to any peer without connecting to it or knowing anything else about it. I'm taking this concept directly from the kademlia DHT design (which is the DHT used in IPFS)

I think peers will only bother learning about the nodes connected to their immediate neighbors, so some sort of gossip protocol will be useful here. I'm still reading up about the different strategies out there for distributed MSPs so I'm not 100% sure what the requirements are yet.

I like the idea of adding weights based on what a node things the "health" of their peers is, I'd be worried about spreading that information though, because two nodes may have a poor connection between each other, but another node may have a good connection to both based on what their physical relationship is.

I think that for an MVP of a MSP type pub-sub protocol, adding in health measures will complicate the implementation, but it should definitely be investigated once we can actually construct a MSP.

pgte commented 6 years ago

@pgte https://github.com/pgte The distance function is there to determine which nodes are logically "close" to each other in order to determine who should connect together to form the minimum spanning tree. The main benefit is that it's computationally cheap to determine the logical distance since all you need is the ID of the peer, a node can calculate it's distance to any peer without connecting to it or knowing anything else about it. I'm taking this concept directly from the kademlia DHT https://en.wikipedia.org/wiki/Kademlia#System_details design (which is the DHT used in IPFS)

Gotcha. I think peers will only bother learning about the nodes connected to their immediate neighbors, so some sort of gossip protocol will be useful here. I'm still reading up about the different strategies out there for distributed MSPs so I'm not 100% sure what the requirements are yet.

I like the idea of adding weights based on what a node things the "health" of their peers is, I'd be worried about spreading that information though, because two nodes may have a poor connection between each other, but another node may have a good connection to both based on what their physical relationship is.

I think that it's exactly what both the Swim.js protocol and the Accrual failure detection model solve: it gives a cooperative measure of health, not a one-sided measure (the swim.js less so). I think that for an MVP of a MSP type pub-sub protocol, adding in health measures will complicate the implementation, but it should definitely be investigated once we can actually construct a MSP.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ipfs/notes/issues/64#issuecomment-362608541, or mute the thread https://github.com/notifications/unsubscribe-auth/AAC7Jhr2UhfYsk9_lC0xA1SOKC6uVS2Lks5tQyI5gaJpZM4GL9fG.

RangerMauve commented 6 years ago

I've been tinkering around and here's a demo of an algorithm I've made for building a MSP. In our case the actual weights don't matter too much because they're arbitrary logical distances.

Every node will first boostrap by getting a list of random other peers.
It will then sort those peers by their logical distance using an xor of their IDs.
It will then choose the closest peer that it isn't already connected to and connect to it

I've made a demo with 32 nodes which boostrap with a list of 6 random nodes.

Here's the link

I'm not sure how viable this is yet, so I'll work on other visualizations in order to catch possible partitions forming.

RangerMauve commented 6 years ago

I've updated my demo to better visualize what's happening.

I've found that having nodes find a random list of other node IDs, then sorting them my logical distance and connecting to the closes one, I always get a fully-connected tree.

I'll explore logic for visualizing messages being sent down the tree (for pub sub) and dropped connections.

paralin commented 6 years ago

@RangerMauve what you're describing really is just a network topology mesh, where your route priority metric is driven only by logical distance.

Look at the Babel routing algorithm, there is a lot we can learn from that. Batman-adv as well.

RangerMauve commented 6 years ago

@paralin Yes. I'm not focusing on routing data across the mesh at the moment. I'm focusing on forming the mesh in a way where individual nodes know every little about other nodes or the shape of the graph itself.

The existing pubsub connects to nodes at random and has a chance of having "supernodes" which have too much bandwidth going to them and network partitions which prevent groups of peers from sending events between each other.

Once an efficient and self-healing mesh is set up, it can be a basis for the protocols that will be built on top (pubsub to start, but other things can probably make use of it).

My initial goal is to make a more efficient pubsub.

I love what the Kademlia DHT does for discovering peers that have a resource, but there needs to be something for forming a persistent network for distributed applications that need to get data to all nodes interested in it.

paralin commented 6 years ago

@RangerMauve Wrt. Babel: I am suggesting we use it as a reference for more reasons -

They also limit knowledge of the network topology to only the next hop.
They avoid routing loops (black holes).
They successfully balance multiple routing metrics, and have experience with understanding which to prioritize when.

The same goes for batman-adv, except that project focuses more on radio transmission optimizations.

This matches with what you're talking about with mesh balancing. It's the same problem we are solving. Routing in the typical internet sense is the same thing as routing here.

It might be a good idea to look at generalizing the concept of routing a bit, and applying it in pub-sub as well as content and network routing.

RangerMauve commented 6 years ago

@paralin Sounds good. Is the RFC a good place to start?

Babel looks like a protocol that should be used after the actual connections are made between nodes. I'm still at the stage where I'm figuring out which nodes should connect to which, but after that this will be great for making the actual traffic over the mesh be more efficient.

paralin commented 6 years ago

@RangerMauve I see what you're saying, they are actually two different problems actually when you consider the structure of the open internet, and don't limit connections to link-local.

In a way though it is actually the same thing. Consider this: all of the peers online in the world are actually just next-hop peers in the babel algorithm. What you're doing is trying to figure out what subset of that global set to connect to, and then prioritize further based on that. But it's still the same problem.

The 'brute-force' solution would rank all of the known peers based on available information about them, used to compute a routing metric. You then sort by the routing metric, chose the top N nodes, connect to those, etc.

Babel fits here as a mechanism of generating that routing metric. It's one of many. I think maybe a pluggable system for deriving routing metrics might be a good approach to this. You can run the fast route scoring algorithms first, before exploring a more advanced plan. It might be possible to map this to some sort of opportunistic graph solver.

RangerMauve commented 6 years ago

@paralin I get what you're saying, but I think that's a bit heavy and requires a node to know a lot more peers. With my current algorithm, a node can search the DHT for the first 20 random nodes it finds, and then connect to the mesh without focusing the load too much in once place. All it needs to do to decide who to connect to is their ID without having to somehow measure latency or any other metrics about them. Otherwise a node would need to connect to the network, learn about potential peers and how they might relate to themselves, and then actually connect to the network. Though this will be more efficient in the long run, the initial setup is likely going to be too long for the time it takes for a typical web app to load.

paralin commented 6 years ago

@RangerMauve We are in agreement, I'm just generalizing what you're describing a bit.

You chose a next peer to send traffic to, then you build a connection to that peer, then you send the packet to the peer. The question of where is best to send the traffic is the same as the question of where to connect to next. What you are proposing, is to start with no information about the peers, and to randomly connect to some from the DHT, and then optimize from there. This is the same as what I described in my last comment - you start with no information, you make a decision at random, and then as you learn more information about the network, the routing can be optimized further.

Babel is just one example of a protocol to quickly exchange information about available routes in a closed network. It does not apply to this issue directly, but it is an example of an efficient way of exchanging information about the network between peers that happen to already be connected.

If we structure the code in such a way that these routing metrics, and the way they are prioritized and balanced together, are pluggable, then we can actually start to make use of the traffic metrics in LibP2P to make better routing decisions.

paralin commented 6 years ago

There must be some way to make a incentive-based system like BitSwap, but for incentivizing routing packets through the network. (Different problem, of course :))

@RangerMauve It's a little bit like a particle filter in a way... If you make a measurement of the ping to a peer, for example, the confidence and derived weight that information has on routing decisions can decay over time (temporal decay) as well as when you have other detected changes in the network topology (event-based invalidation). If we have the structure described above for pluggable metrics, you could get really fancy really quickly with algorithms to make, probably.

ORBAT commented 6 years ago

The BitSwap ledger is already based on bytes sent and received, so I'd assume it should be relatively simple (and I do mean relatively) to extend it to also account for routed bytes (maybe as a separate category, or with some weight etc. since message sizes are likely negligible compared to blocks)

ORBAT commented 6 years ago

Oh, and I don't know if someone's already looked into Chord, but there's been some work on how to build pub/sub systems with it:

This paper gives the basic idea
This paper shows how to implement equality, suffix, prefix and contains predicates for subscriptions
This paper gives an overview of content-based pub/sub systems, including DHTs

There are also a couple of different pub/sub-like schemes over Pastry:

I'd argue that mimicing their approaches might be a good way to go. There's likely similar systems on top of Kademlia, so it might be worth while to look into them. Chord, for example, is provably correct and functions well even with high churn.

edit: for an example of a wildly different approach, there's Secure Scuttlebutt that builds on the Scuttlebutt gossip protocol. Having a gossip protocol in libp2p doesn't sound bad at all.

revenge of the edit: @spikebike talked about PolderCast some years ago, and as he said, it seems like an extremely good choice.

ghost commented 6 years ago

I'm curious if you folks are aware of what these folks are up to. https://github.com/libp2p/go-floodsub/pull/67

paralin commented 6 years ago

Submitting aeron for us to read about and investigate:

It seems to be a publisher/subscriber system that uses efficient UDP multicast or inter-process communication to get the job done.

ORBAT commented 6 years ago

What's the relationship between this discussion and floodsub? Is floodsub the "official" pub/sub for IPFS, or is this discussion completely independent?

daviddias commented 3 years ago

For everyone still following this thread, you will want to check the latest on libp2p PubSub with Gossipsub v1.1

ipfs / notes

pub/sub - publish / subscribe #64