n0-computer / iroh

peer-2-peer that just works
https://iroh.computer
Apache License 2.0

Feature Request: Add content-based filtering for Gossip client rebroadcasts #2721

Open arilotter opened 2 months ago

arilotter commented 2 months ago

Feature Request: Content-based filtering for Gossip client rebroadcasts

Current Behavior

The Gossip client in Iroh's Node currently rebroadcasts all messages associated with a topic that the node is subscribed to, without any content-based filtering.

Proposed Enhancement

Add the ability to filter rebroadcasts based on message content, not just the topic. This would allow nodes to drop messages they consider invalid or irrelevant instead of forwarding them, cutting traffic that well-behaved nodes would discard on read anyway. This could reduce bandwidth for applications where messages can be incorrect.

Use Cases

  1. Dropping malformed messages

    • A node could misbehave, either due to a bug or malicious intent. If its data doesn't match a checksum, the message could be dropped instead of rebroadcast.
  2. Dropping stale messages after some expiry

    • This could be used to implement a time-to-live (TTL) mechanism for messages
    • It would ensure that only current information is circulated in the network, preventing stale data from needlessly using bandwidth.
  3. Dropping messages from unauthenticated users

    • With some authentication system, a public key might no longer have permission to participate in a network after their broadcast has been sent. You could check if a user is still authenticated before rebroadcasting their message, reducing bandwidth and DoS potential from unauthenticated users.
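To make the three use cases concrete, here is a rough sketch of what such predicates could look like. Everything in it is hypothetical: `AppMessage`, `toy_checksum`, and the `allowed` set are illustrative stand-ins, not iroh types, and the checksum is a plain byte sum rather than a real hash or signature.

```rust
use std::collections::HashSet;
use std::time::{Duration, SystemTime};

/// Hypothetical application-level message; not an iroh type.
pub struct AppMessage {
    pub author: [u8; 32],
    pub sent_at: SystemTime,
    pub checksum: u32,
    pub payload: Vec<u8>,
}

/// Toy checksum (wrapping byte sum), a stand-in for a real hash or signature.
pub fn toy_checksum(payload: &[u8]) -> u32 {
    payload.iter().fold(0u32, |acc, b| acc.wrapping_add(*b as u32))
}

/// Use case 1: reject messages whose payload does not match the checksum.
pub fn is_well_formed(msg: &AppMessage) -> bool {
    toy_checksum(&msg.payload) == msg.checksum
}

/// Use case 2: reject messages older than `ttl`.
pub fn is_fresh(msg: &AppMessage, now: SystemTime, ttl: Duration) -> bool {
    match now.duration_since(msg.sent_at) {
        Ok(age) => age <= ttl,
        // `sent_at` lies after `now` (clock skew); treated as fresh in this sketch.
        Err(_) => true,
    }
}

/// Use case 3: reject messages from authors that are no longer allowed.
pub fn is_authorized(msg: &AppMessage, allowed: &HashSet<[u8; 32]>) -> bool {
    allowed.contains(&msg.author)
}

/// Combined predicate: rebroadcast only if all three checks pass.
pub fn should_rebroadcast(
    msg: &AppMessage,
    now: SystemTime,
    ttl: Duration,
    allowed: &HashSet<[u8; 32]>,
) -> bool {
    is_well_formed(msg) && is_fresh(msg, now, ttl) && is_authorized(msg, allowed)
}
```

Passing `now` in explicitly keeps the TTL check deterministic and easy to test. A real filter would likely need to decode the gossip payload into something like `AppMessage` first.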

Proposed Implementation

  1. Extend the Gossip client API to accept a custom filtering function
  2. Allow users to define their own filtering logic based on message content
  3. Apply the filter before rebroadcasting any message
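The three steps above could be sketched roughly as follows. This is a toy model, not the actual gossip actor: `Relay`, `RebroadcastFilter`, and `on_message` are made-up names, and messages are modeled as raw bytes.

```rust
/// Hypothetical filter type: returns `true` if the message should be forwarded.
pub type RebroadcastFilter = Box<dyn Fn(&[u8]) -> bool + Send + Sync>;

/// Toy stand-in for the rebroadcast step; not the real iroh gossip actor.
pub struct Relay {
    /// Optional user-supplied filter (steps 1 and 2).
    filter: Option<RebroadcastFilter>,
    /// Messages that would have been forwarded to neighbors.
    pub forwarded: Vec<Vec<u8>>,
}

impl Relay {
    pub fn new(filter: Option<RebroadcastFilter>) -> Self {
        Relay { filter, forwarded: Vec::new() }
    }

    /// Handles an incoming message and returns whether it was forwarded.
    /// With no filter set, every message is forwarded (today's behavior).
    pub fn on_message(&mut self, msg: Vec<u8>) -> bool {
        // Step 3: consult the filter before rebroadcasting.
        let keep = self.filter.as_ref().map_or(true, |f| f(msg.as_slice()));
        if keep {
            self.forwarded.push(msg);
        }
        keep
    }
}
```

Keeping the filter optional means existing callers that pass `None` would see no change in behavior.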

Current API

The current API unconditionally rebroadcasts any messages with the same gossip topic.

let (gossip_tx, gossip_rx) = node
    .gossip()
    .subscribe(topic, peer_ids)
    .await?;

Proposed API

Perhaps we could add filter options to the SubscribeOpts struct, letting consumers set a filter when calling subscribe_with_opts:

pub struct SubscribeOpts {
    pub bootstrap: BTreeSet<NodeId>,
    pub subscription_capacity: usize,
    pub filter: Option<Box<dyn Fn(&SubscribeResponse) -> bool + Send + Sync>>,
}

matheus23 commented 2 months ago

pub struct SubscribeOpts {
    pub bootstrap: BTreeSet<NodeId>,
    pub subscription_capacity: usize,
    pub filter: Option<Box<dyn Fn(&SubscribeResponse) -> bool + Send + Sync>>,
}

I'd say it's not unlikely that the filter function will want to do asynchronous work. E.g. when you want to do a quick DB lookup.

The obvious way would be to make filter an Fn that returns a BoxFuture. Another way would be to turn the interface inside out, in the sense of external vs. internal iterators: when you receive a message, you also get a continuation with it. Unless you call SubscribeResponse::continue_rebroadcast, the message is effectively filtered out.

This way you have the context in which you accept subscriptions at your service (variables in scope, an async context, etc.). But it's also a way bigger change & needs some thinking in terms of what this means for the gossip's actor loop, and you can wreak so much more havoc this way.
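To illustrate the continuation idea, here is a synchronous toy sketch using only the standard library. Only the name `continue_rebroadcast` comes from the comment above. `Received`, `ToyGossip`, and the channel plumbing are assumptions made for illustration, and a real version would be async and tied into the gossip actor loop.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical handle for a received message.
pub struct Received {
    pub content: Vec<u8>,
    forward_tx: Sender<Vec<u8>>,
}

impl Received {
    /// Consumes the handle and queues the message for forwarding.
    /// Dropping the handle without calling this filters the message out.
    pub fn continue_rebroadcast(self) {
        // Ignore the error if the forwarding side has already shut down.
        let _ = self.forward_tx.send(self.content);
    }
}

/// Toy stand-in for the gossip side that hands out handles.
pub struct ToyGossip {
    forward_tx: Sender<Vec<u8>>,
    forward_rx: Receiver<Vec<u8>>,
}

impl ToyGossip {
    pub fn new() -> Self {
        let (forward_tx, forward_rx) = channel();
        ToyGossip { forward_tx, forward_rx }
    }

    /// Wraps an incoming message in a handle for the application to inspect.
    pub fn receive(&self, content: Vec<u8>) -> Received {
        Received { content, forward_tx: self.forward_tx.clone() }
    }

    /// Collects everything the application chose to rebroadcast so far.
    pub fn drain_forwarded(&self) -> Vec<Vec<u8>> {
        self.forward_rx.try_iter().collect()
    }
}
```

Because `continue_rebroadcast` takes `self` by value, a message can be forwarded at most once, and the application decides with whatever context it has in scope.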

arilotter commented 2 months ago

I was reading through the codebase and it seems that SubscribeOpts ends up getting serialized and sent over RPC, which presents a big problem for a function :sweat_smile:

matheus23 commented 2 months ago

In that case we could split up the subscribe function into two functions as I described above: one that streams in events for arrived messages, and another on the messages themselves that allows explicitly rebroadcasting them.

We could also still keep the original subscribe function for backwards compatibility.

That said, this is getting out of my depth, so I'd love to see what others think. Perhaps @rklaehn?

rklaehn commented 1 month ago

We will move gossip to its own repo. We are planning iroh 1.0, and gossip will not be part of that.

Having gossip in a separate repo should make it easier to add such functionality. We will move the issue to the new repo once we do the reorg.