libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub

Filter for nodes that we gossip to #443

Open synzhu opened 3 years ago

synzhu commented 3 years ago

Context: https://discuss.libp2p.io/t/filtering-nodes-that-we-gossip-to/1012

We need a mechanism that lets us provide a validator function that, given a message, filters out the nodes to which we don't want to forward the message or gossip about it.

In other words, we want to implement selective gossip / message forwarding.

synzhu commented 3 years ago

@raulk may I start working on a PR for this? This is a very useful feature that we would like to use soon :)

synzhu commented 3 years ago

@vyzo Can you help me transfer this issue to the pubsub repo?

vyzo commented 3 years ago

done; also, you don't need raul's permission to start working on this :)

synzhu commented 3 years ago

@vyzo I was also wondering if you had any thoughts on how this could be implemented cleanly? Originally I intended to do something in a similar vein to the topic validator, but all of the topic validation logic actually occurs in pubsub.go (and validation.go), before Publish (https://github.com/libp2p/go-libp2p-pubsub/blob/master/gossipsub.go#L943) is called.

For the selective message filtering however, it will need to happen inside Publish, because we will need to run the filter per-peer.

The way the code is currently structured, it would probably need to go somewhere around here (https://github.com/libp2p/go-libp2p-pubsub/blob/master/gossipsub.go#L1008). I think this is not ideal because we would need to implement it separately for each Router implementation, even though the logic should be basically identical.

Instead, I was thinking of adding the logic in handleSendingMessages (https://github.com/libp2p/go-libp2p-pubsub/blob/master/comm.go#L155), which would allow us to remove all the information that we care about from the message before sending it out. This could potentially be made even more general-purpose than my use case here, i.e. a function that allows you to modify each message right before it is sent out.

What do you think about this? The one catch with this approach is that I would need some way to get the peer.ID associated with a Stream so that it can be passed to the filter function, which I don't think is possible today. So maybe I'd have to work on an issue for that as well. Either way, this is functionality we need for a different use case anyway.

Also, the above still does not entirely solve the problem for my specific use case. I want to be able to prevent all information about a certain message's existence from being passed on to certain peers, including IHAVE messages, and in particular my criterion for choosing these messages is the message's original sender.

In other words, I make the decision based on who the message came from, and who the information would be going to.

However, IHAVEs are stored as message IDs in the RPC, so just by looking at an RPC alone I cannot tell which IHAVEs I need to remove. If I have something like the above implemented, I could probably get away with remembering this data at the application level and looking it up when my filter function is called, but this is a bit hacky.
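The application-level bookkeeping mentioned above could look roughly like the following. This is only a sketch under stated assumptions: `PeerID` is a stand-in for libp2p's peer.ID, and `OriginTracker`, `FilterIHave`, and the `allow` policy are hypothetical names that do not exist in go-libp2p-pubsub. It records the original sender of each message ID and uses that record to decide which IDs may be advertised to a given peer. IDs with an unknown origin are dropped, which is a conservative choice made for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// PeerID is a stand-in for libp2p's peer.ID (assumption, keeps the sketch dependency-free).
type PeerID string

// OriginTracker (hypothetical) remembers which peer originally sent each message ID.
type OriginTracker struct {
	mu     sync.RWMutex
	origin map[string]PeerID
}

func NewOriginTracker() *OriginTracker {
	return &OriginTracker{origin: make(map[string]PeerID)}
}

// Record stores the original sender of a message ID.
func (t *OriginTracker) Record(msgID string, from PeerID) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.origin[msgID] = from
}

// Origin looks up the original sender of a message ID.
func (t *OriginTracker) Origin(msgID string) (PeerID, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	from, ok := t.origin[msgID]
	return from, ok
}

// FilterIHave keeps only the message IDs that may be advertised to peer `to`.
// IDs whose origin is unknown are dropped.
func FilterIHave(t *OriginTracker, ids []string, to PeerID, allow func(from, to PeerID) bool) []string {
	kept := make([]string, 0, len(ids))
	for _, id := range ids {
		from, ok := t.Origin(id)
		if ok && allow(from, to) {
			kept = append(kept, id)
		}
	}
	return kept
}

func main() {
	tracker := NewOriginTracker()
	tracker.Record("msg-1", "alice")
	tracker.Record("msg-2", "bob")

	// Example policy: nothing that originated from alice may reach carol.
	allow := func(from, to PeerID) bool { return !(from == "alice" && to == "carol") }

	fmt.Println(FilterIHave(tracker, []string{"msg-1", "msg-2", "msg-3"}, "carol", allow))
}
```

A real version would also need to expire old entries, since this map grows without bound.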

vyzo commented 3 years ago

I don't think putting it in handleSendingMessages is correct; the right place to put it is in the peer selection.

Simplest way is to have a filter of the form func(topic, peer) bool which we apply inside the peer selection for gossip.

For publishing messages, a similar (or the same) filter can be applied, in peer selection for mesh construction and flood publishing selection.

The default would be a func that returns true unconditionally.

Note: it's very easy to get the peer for a stream: s.Conn().RemotePeer()
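A filter of the form func(topic, peer) bool, with a default that always returns true, might be sketched as follows. The names `PeerFilter`, `AllowAll`, and `SelectPeers` are assumptions made for illustration, and `PeerID` stands in for libp2p's peer.ID so that the example has no dependencies. It does not reflect how the routers actually select peers; it only shows where such a filter would be consulted.

```go
package main

import "fmt"

// PeerID is a stand-in for libp2p's peer.ID (assumption).
type PeerID string

// PeerFilter (hypothetical) reports whether a peer may be selected for a topic.
type PeerFilter func(topic string, p PeerID) bool

// AllowAll is the default filter: it accepts every peer unconditionally.
func AllowAll(topic string, p PeerID) bool { return true }

// SelectPeers returns the candidates accepted by the filter, preserving order.
// A nil filter falls back to AllowAll.
func SelectPeers(topic string, candidates []PeerID, filter PeerFilter) []PeerID {
	if filter == nil {
		filter = AllowAll
	}
	selected := make([]PeerID, 0, len(candidates))
	for _, p := range candidates {
		if filter(topic, p) {
			selected = append(selected, p)
		}
	}
	return selected
}

func main() {
	peers := []PeerID{"peerA", "peerB", "peerC"}

	// Example policy: never select peerB for the "blocks" topic.
	deny := func(topic string, p PeerID) bool {
		return !(topic == "blocks" && p == "peerB")
	}

	fmt.Println(SelectPeers("blocks", peers, AllowAll))
	fmt.Println(SelectPeers("blocks", peers, deny))
}
```

Because the filter only sees the topic and the peer, it cannot express a per-message policy on its own.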


vyzo commented 3 years ago

Note that the filter would live in the pubsub object, initialized with a pubsub option, and thus be available to each router. We want the router to decide how to use it; there is no one-size-fits-all.
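One way to picture this arrangement is a functional option that stores the filter on the pubsub object, with each router consulting it in its own peer selection. The sketch below is an assumption-laden illustration: `PubSub`, `Option`, `WithPeerFilter`, `NewPubSub`, and `floodRouter` here are simplified stand-ins defined locally, not the library's actual types or API.

```go
package main

import (
	"errors"
	"fmt"
)

// PeerID is a stand-in for libp2p's peer.ID (assumption).
type PeerID string

// PeerFilter (hypothetical) reports whether a peer may be selected for a topic.
type PeerFilter func(topic string, p PeerID) bool

// PubSub is a simplified stand-in for the pubsub object.
type PubSub struct {
	peerFilter PeerFilter
}

// Option configures a PubSub at construction time.
type Option func(*PubSub) error

// WithPeerFilter (hypothetical) installs a peer filter on the pubsub object.
func WithPeerFilter(f PeerFilter) Option {
	return func(ps *PubSub) error {
		if f == nil {
			return errors.New("peer filter must not be nil")
		}
		ps.peerFilter = f
		return nil
	}
}

// NewPubSub applies the options on top of a default filter that allows every peer.
func NewPubSub(opts ...Option) (*PubSub, error) {
	ps := &PubSub{peerFilter: func(string, PeerID) bool { return true }}
	for _, opt := range opts {
		if err := opt(ps); err != nil {
			return nil, err
		}
	}
	return ps, nil
}

// floodRouter is one hypothetical router; it decides for itself where to apply the filter.
type floodRouter struct{ ps *PubSub }

// targets returns the peers this router would send to for the topic.
func (r *floodRouter) targets(topic string, peers []PeerID) []PeerID {
	var out []PeerID
	for _, p := range peers {
		if r.ps.peerFilter(topic, p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	ps, err := NewPubSub(WithPeerFilter(func(topic string, p PeerID) bool {
		return p != "peerB"
	}))
	if err != nil {
		panic(err)
	}
	r := &floodRouter{ps: ps}
	fmt.Println(r.targets("blocks", []PeerID{"peerA", "peerB"}))
}
```

Keeping the filter on the shared object while leaving its application to each router matches the point that there is no single correct place to apply it.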


synzhu commented 3 years ago

@vyzo Cool. My original concern was that I didn't think we wanted each router to decide how to use it, and that there should be a common way to handle this. But I see now that may not actually be the case, so I have a good idea of how to get this done. Thanks!

synzhu commented 3 years ago

Note that the filter would be in the pubsub object, initialized with a pubsub option, and thus available to each router.

I noticed that topic validators are not initialized with a pubsub option, but rather via RegisterTopicValidator. I assume this is in order to support dynamic registration and unregistration.

I have no current use case for unregistering my peer selection filter, but do you want me to support this?

synzhu commented 3 years ago

Also, the above still does not entirely solve the problem for my specific use case. I want to be able to prevent all information about a certain message's existence from being passed on to certain peers, including IHAVE messages, and in particular my criterion for choosing these messages is the message's original sender.

In other words, I make the decision based on who the message came from, and who the information would be going to.

However, IHAVEs are stored as message IDs in the RPC, so just by looking at an RPC alone I cannot tell which IHAVEs I need to remove. If I have something like the above implemented, I could probably get away with remembering this data at the application level and looking it up when my filter function is called, but this is a bit hacky.

Actually, I'm still not entirely sure that this is resolved. Implementing things at the peer selection level will allow me to filter out peers that I forward a particular message to, but it would not prevent me from later gossiping about that message to a filtered-out peer, or from responding to an IWANT for that message from that peer, right?