PubSub API - Githubissues

hackergrrl commented 8 years ago

This issue is to discuss the user-facing API of a pubsub service on top of libp2p.

Here are some pertinent questions to stimulate discussion:

What is the minimal API?
Who can publish to a topic? Are there per-topic permissions?
Are messages encrypted? Can anyone listen in on a topic?
Do subscribers expect in-order delivery?
Can messages be dropped?
Do new subscribers expect to be able to fetch history?

hackergrrl commented 8 years ago

cc @haadcode @whyrusleeping @nginnever @jbenet

hackergrrl commented 8 years ago

My thoughts:

Minimal API

A node can: subscribe to a topic (and receive events when new messages arrive), publish a message to a topic, and unsubscribe from a topic.
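To make the three operations concrete, here is a minimal in-process sketch. It assumes a single process with no networking, and `LocalPubSub` and its method names are hypothetical, not an existing libp2p API.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A handler receives the topic name and the raw message bytes.
Handler = Callable[[str, bytes], None]


class LocalPubSub:
    """In-process stand-in for a pubsub service (hypothetical, no networking)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register a callback that fires for every new message on the topic.
        self._handlers[topic].append(handler)

    def unsubscribe(self, topic: str, handler: Handler) -> None:
        # Removing a handler that was never registered is a no-op.
        handlers = self._handlers.get(topic, [])
        if handler in handlers:
            handlers.remove(handler)

    def publish(self, topic: str, message: bytes) -> None:
        # Iterate over a copy so handlers may unsubscribe while being called.
        for handler in list(self._handlers.get(topic, [])):
            handler(topic, message)
```

A subscriber would call `subscribe("news", handler)`, receive each later `publish("news", ...)`, and stop receiving after `unsubscribe("news", handler)`.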

Who can publish to a topic?

One option is to have the topic name be the public key of a keypair, where the private key is necessary to sign published messages that will be accepted.

To permit more complex permission models (where other nodes could be granted permission to publish or have permission revoked), there could be a merkle dag sidechannel where the root publisher can sign messages that state things like "this pubkey can also publish" or "I'm revoking this pubkey's ability to publish" or "this pubkey can publish but only if the messages pass this predicate". This permission log could be replicated across the pubsub swarm to keep members synchronized on the permission state of the topic.
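A rough sketch of how a node might replay such a permission log to work out who may currently publish. It assumes signatures have already been verified elsewhere, represents keys as plain strings, and leaves out the merkle dag linking and the predicate case. `PermissionEntry` and `allowed_publishers` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class PermissionEntry:
    action: str  # "grant" or "revoke"
    pubkey: str  # key whose permission changes
    signer: str  # key that signed this entry (assumed already verified)


def allowed_publishers(root: str, log: Iterable[PermissionEntry]) -> Set[str]:
    """Replay the log in order and return the keys currently allowed to publish."""
    allowed = {root}
    for entry in log:
        if entry.signer != root:
            continue  # only the root publisher may change permissions
        if entry.action == "grant":
            allowed.add(entry.pubkey)
        elif entry.action == "revoke" and entry.pubkey != root:
            allowed.discard(entry.pubkey)
    return allowed
```

Because every member replays the same log in the same order, they all arrive at the same set of allowed keys.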

Are messages encrypted? Can anyone listen in on a topic?

Some options:

Encryption could happen entirely on the application level. (out of scope of pubsub)
Publisher(s) could maintain a whitelist of peers and encrypt messages so that only those peers can read them.
Publisher(s) could pre-share a topic-wide shared key for decryption.

Do subscribers expect in-order delivery? Can messages be dropped?

Easier if they don't. Sequencing could be built on top of a dumber pubsub, since not all applications want the trade-offs that come with this property.
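As an illustration of sequencing layered on top, here is a sketch of a receiver-side reorder buffer. It assumes each publisher attaches an increasing sequence number to its messages; `Resequencer` is a hypothetical name, and the underlying pubsub is free to deliver out of order or more than once.

```python
from typing import Callable, Dict


class Resequencer:
    """Buffers out-of-order messages and hands them on in sequence order."""

    def __init__(self, deliver: Callable[[bytes], None], first_seq: int = 0) -> None:
        self._deliver = deliver
        self._next = first_seq            # next sequence number to deliver
        self._pending: Dict[int, bytes] = {}

    def receive(self, seq: int, payload: bytes) -> None:
        if seq < self._next:
            return  # duplicate or stale message, already delivered
        self._pending[seq] = payload
        # Flush every message that is now contiguous with what was delivered.
        while self._next in self._pending:
            self._deliver(self._pending.pop(self._next))
            self._next += 1
```

Note the trade-off this exposes: if a message is dropped, everything after it stays buffered until the gap is filled, which is one reason not every application wants ordering built in.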

Do new subscribers expect to be able to fetch history?

This should be built as a layer on top of dumb ephemeral pubsub.
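One way such a layer could look: an ordinary subscriber that records what it sees and serves it to newcomers on request. This is only a sketch; `HistoryKeeper` is a hypothetical name, the handler signature `(topic, message)` is an assumption, and how a new subscriber finds and queries a keeper is left open.

```python
from collections import deque
from typing import Deque, List


class HistoryKeeper:
    """A regular subscriber that remembers the last `limit` messages it saw."""

    def __init__(self, limit: int = 100) -> None:
        # A bounded deque silently drops the oldest message once full.
        self._log: Deque[bytes] = deque(maxlen=limit)

    def on_message(self, topic: str, message: bytes) -> None:
        # Registered as the subscription handler on the ephemeral pubsub.
        self._log.append(message)

    def history(self) -> List[bytes]:
        # Oldest first, so a new subscriber can replay in arrival order.
        return list(self._log)
```

The pubsub itself stays ephemeral; only nodes that choose to run a keeper pay the storage cost.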

haadcode commented 8 years ago

@noffle agreed with everything you said :+1:

nginnever commented 8 years ago

@noffle I like the questions raised here, they hit home with my apps.

Really interested in the idea mentioned under who can publish to a topic/chan. Our current implementation is RSA keys with pending support for other crypto algorithms. When you mention "I'm revoking this pubkey's ability to publish" or "maintaining a whitelist", I think of cjdns and how you can kick nodes off your connection by simply editing the config file. I'm not too familiar with the mechanics behind cjdns, but I think the configs just maintain specific keys for each person if you want to maintain a nice list of who has access. Sharing the keys in a distributed manner may be difficult, however.

Which brings me to the question: what would be a good mechanism for sharing, in a distributed manner, the decryption keys or private keys that need to be stored in a config or whitelist (like cjdns)? There is always zero-knowledge proof of knowledge, but that is some new and complicated mathematics.

These questions may be more appropriate for later and getting something simple working would be great.

hackergrrl commented 8 years ago

@nginnever: thanks for the comments!

> Which brings me to the question. What would be a good mechanism for sharing decryption keys or private keys needed to be stored in a config or white list (like cjdns) in a distributed manner? There is always zero knowledge proof of knowledge but that is some new and complicated mathematics.

Is this for the case where you want to have a private pubsub swarm where only those with keys can pub or sub? Or where you want to share keys to prospective publishers?

If you know the PeerID of your publisher, then it's just a matter of encrypting the publish key with their public key and sharing that object's IPFS hash with them somehow. A more involved process might involve a per-pubsub-topic permissions log like I mentioned above, where the whole swarm can keep tabs on who is allowed to do what.

Oh, and as @diasdavid pointed out, there's already a lot of discussion on https://github.com/ipfs/notes/issues/64.

nicola commented 8 years ago

great review of Pub/Sub and P2P (2013) http://dl.acm.org/citation.cfm?id=2543583

content-based vs topic-based

The choice we have to make is whether we want to build a topic-based pubsub or a content-based pubsub. The first one will give us subscription to hashes; the second one would give us subscription to some attributes in the values. So for example one could subscribe to every IPLD object that has title: "Hello" (although we could just re-use content-based pubsub to achieve something like $mutable_hash/path/*).

Although topic-based is just easier, content-based may give us the same properties (and more)
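A small sketch of why content-based can subsume topic-based: if a subscription is a predicate over the published object, then "topic equals X" is just one such predicate. `ContentPubSub` and `topic_is` are hypothetical names, and objects are modeled as plain dicts rather than real IPLD objects.

```python
from typing import Any, Callable, Dict, List, Tuple

Obj = Dict[str, Any]
Predicate = Callable[[Obj], bool]
Handler = Callable[[Obj], None]


class ContentPubSub:
    """In-process content-based pubsub: subscribers match on object attributes."""

    def __init__(self) -> None:
        self._subs: List[Tuple[Predicate, Handler]] = []

    def subscribe(self, predicate: Predicate, handler: Handler) -> None:
        self._subs.append((predicate, handler))

    def publish(self, obj: Obj) -> None:
        # Every subscription is evaluated against every published object.
        for predicate, handler in list(self._subs):
            if predicate(obj):
                handler(obj)


def topic_is(name: str) -> Predicate:
    """Topic-based subscription expressed as a content predicate."""
    return lambda obj: obj.get("topic") == name
```

The cost is visible in `publish`: matching is a scan over all predicates here, whereas a topic-based system can route by a simple key lookup.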

dht-dependent vs dht-independent

Do we want our system to re-use the DHT that we already have or can it be separate from it? For example using other overlays e.g. gossip?

nicola commented 8 years ago

Relevant conversation has continued during the workshop here: https://github.com/ipfs/2016-Q3-Workshop/issues/17

ipfs / notes

PubSub API #118
