libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub
Other
328 stars 186 forks source link

Global Validators #58

Open Stebalien opened 6 years ago

Stebalien commented 6 years ago

TODO in #55.

Instead of registering per-topic validators as we do in #55, validators should decide whether or not they care about a topic when we first subscribe to it. E.g.,

type Validator interface {
  Validates(topic string) (bool, error)
  // ...
}
whyrusleeping commented 6 years ago

I don't think the validator should pick which topics it validates. I think we should register a validator for a given topic (or set of topics).

vyzo commented 6 years ago

What do we gain by having Validators select the topic they validate? They complicate the interface, both in terms of writing the validator (can't just pass a function) and the validator selection logic internally.

Stebalien commented 6 years ago

For one, multiple subscribers (i.e., pubsub as a service). Otherwise, if we have two subscribers, one will fail to register its validator. I'd expect validators to work like DHT record validators (you register your validators up-front).

Also, garbage collection. If I register a validator for each individual topic, I can't garbage collect them even if I unsubscribe.

vyzo commented 6 years ago

For one, multiple subscribers (i.e., pubsub as a service). Otherwise, if we have two subscribers, one will fail to register its validator. I'd expect validators to work like DHT record validators (you register your validators up-front).

I don't understand why that's different from the per-topic validator implementation. We register the validators once, orthogonal with subscriptions -- if we are to register a validator with every subscription then why did we ditch per-subscription validators?

Also, garbage collection. If I register a validator for each individual topic, I can't garbage collect them even if I unsubscribe.

That would still be true with global validators.

vyzo commented 6 years ago

Per discussion on irc, the big gain from global validators would be validator classes. By using namespaced topics we can register validators for an entire class of topics (eg IPNS) and not have to register a validator for every topic.

Stebalien commented 6 years ago

I've been trying to use validators in IPNS over pubsub and this issue is making it very difficult to do so reliably. At the moment, I can:

  1. Assume that the duplicate (already registered) validator is correct and ignore the error (insecure).
  2. Validate records twice, once in the topic validator and once when processing the value (inefficient).
  3. Track which topics I've already registered validators on (painful).

(and none of these options work in the presence of multiple subscribers)

aschmahmann commented 5 years ago

I've been talking with @Stebalien about this a bit in the context of #184, which has necessitated that we take another look at the PubSub interfaces.

There's a current proposal to shift from PubSub functions like Publish(topic string, ....) to something like topic := JoinTopic(topic string, ...); topic.Publish(...). This has the nice feature of helping separating out the code a bit (global PubSub vs Topic specific operations), and also gives us the ability to deprecate the older interfaces and make any improvements we need to (e.g. adding a context to Publish).

One issue that's come up in the new interfaces is how to deal with Validators and generally if there should exist one shared Topic object per topic, or whether there may exist many handlers for a given topic. If there is only one shared Topic object then it's very easy to make it clear that dealing with Validators is to be handled in application space (e.g. you can only pass a validator in on topic join, so if your application will try to utilize the topic in multiple places make sure they agree on a validator, or alternatively pass in a stateful validator that is extenally updateable). However, if there can be multiple handlers for a given topic then should come to some decisions as to what it would mean for two handlers to each want different validators and how we should deal with that (e.g. have the sum validator be the ORs of all the registered validators for a given topic at the network level and use the specific handlers validator when returning new messages).

@raulk @Stebalien @vyzo Any thoughts you have on this (or the #184 PR in general) would be very helpful.

Stebalien commented 5 years ago

I've created a new issue (https://github.com/libp2p/go-libp2p-pubsub/issues/198) to discuss this as it's a bit off topic here. I've also tried to fill in the background a bit so anyone joining the discussion can better understand the proposal and motivation.