libp2p / specs

Technical specifications for the libp2p networking stack
https://libp2p.io

Simple, minimal peer exchange protocol #222

Open raulk opened 5 years ago

raulk commented 5 years ago

This issue proposes a general-use peer exchange (PEX) protocol that is not embedded in any specific protocol like gossipsub/episub.

The goal of PEX is to enable peers to share records about other peers they're connected to in a 1:1, ad-hoc fashion. It does not intend to produce deterministic results like DHTs, nor does it rely on a structured network or shared heuristic. BitTorrent uses PEX to streamline tracker-less peer discovery.

In the context of gossipsub, PEX proves useful to find additional peers in a topic we subscribe to, as a way of strengthening our topic mesh. Through subscription beaconing (i.e. peers gossiping about which topics they're subscribed to), it may even be possible to bootstrap a topic subscription without hitting the DHT or any other structured discovery mechanism.

I'm thinking we should spec out a minimal PEX protocol, consisting of a simple advertisement schema, and two operations: advertise, lookup.

Advertisement schema

An advertisement struct consists of a peer address record and a set of CIDs we are advertising, signed by the peer's key to prevent MITM attacks.
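As a rough illustration of the schema above, here is a minimal sketch of a signed advertisement. The field names, the JSON encoding, the `Seq` counter and the use of Ed25519 are all assumptions made for the example; the proposal does not fix any of them. A real implementation would also need to check that the verifying key actually belongs to the advertised peer ID, which is omitted here.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/json"
	"errors"
	"fmt"
)

// Advertisement is a hypothetical shape for the record: a peer address
// record plus the set of CIDs being advertised. Field names are assumptions.
type Advertisement struct {
	PeerID string   `json:"peer_id"`
	Addrs  []string `json:"addrs"`
	CIDs   []string `json:"cids"`
	Seq    uint64   `json:"seq"` // assumed counter so newer records can replace older ones
}

// SignedAd carries the encoded advertisement and a signature over those bytes.
type SignedAd struct {
	Payload   []byte
	Signature []byte
}

// Sign encodes the advertisement and signs the encoded bytes.
func Sign(priv ed25519.PrivateKey, ad Advertisement) (SignedAd, error) {
	payload, err := json.Marshal(ad)
	if err != nil {
		return SignedAd{}, err
	}
	return SignedAd{Payload: payload, Signature: ed25519.Sign(priv, payload)}, nil
}

// Verify checks the signature before decoding, so tampered records are rejected.
func Verify(pub ed25519.PublicKey, s SignedAd) (Advertisement, error) {
	if !ed25519.Verify(pub, s.Payload, s.Signature) {
		return Advertisement{}, errors.New("pex: bad signature")
	}
	var ad Advertisement
	if err := json.Unmarshal(s.Payload, &ad); err != nil {
		return Advertisement{}, err
	}
	return ad, nil
}

// newDemoKey generates a throwaway key pair for the example.
func newDemoKey() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

func main() {
	pub, priv := newDemoKey()
	ad := Advertisement{
		PeerID: "peer-a",
		Addrs:  []string{"/ip4/192.0.2.1/tcp/4001"},
		CIDs:   []string{"gossipsub:topic_name"},
	}
	signed, err := Sign(priv, ad)
	if err != nil {
		panic(err)
	}
	got, err := Verify(pub, signed)
	fmt.Println(got.PeerID, err)

	// Flipping a byte of the payload makes verification fail.
	signed.Payload[0] ^= 0xff
	_, err = Verify(pub, signed)
	fmt.Println(err)
}
```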

Local advertisement record maintenance

Our advertisement record is kept in memory and updated at runtime. It is populated with:

The Host API would expose methods so that downstream components (e.g. protocols) can manage advertised CIDs, e.g.:

// We don't want to add an accessor for PEX in the Host interface.
// The host-service refactor is a prerequisite to be able to do this.
svc, ok := host.GetService(&PEX{})
if !ok {
    return nil
}
pexsvc := svc.(PEXService)

ad := pex.NewAd("gossipsub:topic_name")
cancelFn, err := pexsvc.Advertise(ad)
if err != nil {
    return err
}

// ... store the cancelFn in state ...

// atomically replace the advertised value, possibly not useful for gossipsub, 
// but it will be for other protocols.
// Helps mitigate add/remove noise when sending deltas.
ad.Replace("gossipsub:topic_name_b")

// when done / closing down
if err := cancelFn(); err != nil {
    return err
}

Advertise operation

Upon establishing a libp2p connection:

  1. We open a stream for protocol ID /libp2p/pex/v01.
  2. If successful, we push our local advertisement record.
  3. When receiving a record, we store it in memory.

We repeat the above when advertisements or addresses change. Note that this process looks a lot like the identify protocol logic. We could extend the identify protocol to support advertised CIDs. Note that protocol IDs are insufficient to contextualise an advertisement (e.g. we want to know that a peer is a member of gossipsub topic abc, not that it supports gossipsub).
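The push step could be framed on the stream roughly as follows. This is only a sketch: the unsigned-varint length prefix and the `maxRecordSize` bound are assumptions chosen for the example, not something the proposal specifies, and an in-memory buffer stands in for the libp2p stream.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// ProtocolID is the stream protocol ID named in the proposal.
const ProtocolID = "/libp2p/pex/v01"

// maxRecordSize is an assumed upper bound that protects the reader from
// oversized allocations; the proposal does not define one.
const maxRecordSize = 1 << 16

// WriteRecord writes one advertisement record, prefixed by its length as an
// unsigned varint (an assumed framing).
func WriteRecord(w io.Writer, record []byte) error {
	if len(record) > maxRecordSize {
		return fmt.Errorf("pex: record of %d bytes exceeds limit", len(record))
	}
	var hdr [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(hdr[:], uint64(len(record)))
	if _, err := w.Write(hdr[:n]); err != nil {
		return err
	}
	_, err := w.Write(record)
	return err
}

// ReadRecord reads one length-prefixed record and enforces the size limit.
func ReadRecord(r *bufio.Reader) ([]byte, error) {
	size, err := binary.ReadUvarint(r)
	if err != nil {
		return nil, err
	}
	if size > maxRecordSize {
		return nil, fmt.Errorf("pex: record of %d bytes exceeds limit", size)
	}
	buf := make([]byte, size)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// roundTrip pushes a record through an in-memory buffer standing in for a stream.
func roundTrip(record []byte) ([]byte, error) {
	var stream bytes.Buffer
	if err := WriteRecord(&stream, record); err != nil {
		return nil, err
	}
	return ReadRecord(bufio.NewReader(&stream))
}

func main() {
	got, err := roundTrip([]byte("signed advertisement bytes"))
	fmt.Printf("%s: %q %v\n", ProtocolID, got, err)
}
```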

Lookup operation

When the local application/protocol intends to look up peers advertising a specific CID, it sends a lookup RPC to all connected neighbours, who reply with the advertisement records of all directly connected peers they know to be advertising the CID.
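One way to picture the responder side of a lookup is an in-memory store keyed by peer, queried by CID. The sketch below is an assumption-laden illustration: the `Store` type and its methods are hypothetical, peers and CIDs are plain strings, and it returns peer IDs where the proposal would return full advertisement records.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Store keeps the latest advertisement per directly connected peer.
// It is a hypothetical type; the proposal only says records live in memory.
type Store struct {
	mu  sync.RWMutex
	ads map[string]map[string]struct{} // peer ID -> advertised CIDs
}

func NewStore() *Store {
	return &Store{ads: make(map[string]map[string]struct{})}
}

// Put replaces whatever the peer advertised before with the new set of CIDs.
func (s *Store) Put(peer string, cids ...string) {
	set := make(map[string]struct{}, len(cids))
	for _, c := range cids {
		set[c] = struct{}{}
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ads[peer] = set
}

// Remove forgets a peer, e.g. when the connection closes.
func (s *Store) Remove(peer string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.ads, peer)
}

// Lookup returns the peers known to advertise cid, sorted for stable output.
func (s *Store) Lookup(cid string) []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var peers []string
	for peer, cids := range s.ads {
		if _, ok := cids[cid]; ok {
			peers = append(peers, peer)
		}
	}
	sort.Strings(peers)
	return peers
}

func main() {
	store := NewStore()
	store.Put("peer-a", "topic-x", "topic-y")
	store.Put("peer-b", "topic-y")
	fmt.Println(store.Lookup("topic-y"))

	// A newer advertisement from peer-a replaces the old one.
	store.Put("peer-a", "topic-z")
	fmt.Println(store.Lookup("topic-y"))
}
```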

If a peer returns irrelevant/malformed/badly signed ads, we decrease their score on the grounds of displaying malicious behaviour. Below a certain threshold, we blacklist/disconnect the peer.
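The scoring rule can be sketched as a simple counter with a cut-off. The `Scorer` type, the fixed penalty and the threshold value are hypothetical; the proposal does not say how scores are computed, and this sketch is not safe for concurrent use.

```go
package main

import "fmt"

// Scorer tracks a per-peer score that only ever decreases in this sketch.
type Scorer struct {
	scores    map[string]int
	penalty   int // amount subtracted per bad advertisement (assumed)
	threshold int // peers scoring below this are dropped (assumed)
}

func NewScorer(penalty, threshold int) *Scorer {
	return &Scorer{scores: make(map[string]int), penalty: penalty, threshold: threshold}
}

// Penalize lowers the peer's score after an irrelevant, malformed or badly
// signed ad, and reports whether the peer has fallen below the threshold.
func (s *Scorer) Penalize(peer string) bool {
	s.scores[peer] -= s.penalty
	return s.Banned(peer)
}

// Banned reports whether the peer's score is below the threshold.
func (s *Scorer) Banned(peer string) bool {
	return s.scores[peer] < s.threshold
}

func main() {
	scorer := NewScorer(10, -25)
	for i := 0; i < 3; i++ {
		fmt.Println(scorer.Penalize("peer-a"))
	}
}
```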

In its basic form, the lookup operation extends our view of the network by degree 2 (we reach peers of our peers), but it can be further enhanced by a TTL mechanism that allows the request to be relayed N number of hops. Thus, if a peer knows of zero peers advertising the CID, it could relay the request to its neighbours.

I propose we don't venture into relayed lookup requests at this stage, as it requires thoughtful modelling of rate-limiting, quotas, and scoring, to prevent DDoS attacks. But it's definitely something to keep on the radar.

Privacy reflections

Just like with DHTs, it's hard to guarantee reader privacy. PEX could be used to map out how peers interested in a certain subject are effectively connected. We can introduce randomness to deter such attempts.
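One possible reading of "introduce randomness" is to answer each lookup with a random subset of the matching peers, so that a single response never reveals a neighbour's full view. The sketch below assumes that interpretation; `Sample` and `newRNG` are hypothetical helpers, and a real implementation might prefer a cryptographically secure source over `math/rand`.

```go
package main

import (
	"fmt"
	"math/rand"
)

// newRNG returns a seeded generator; a fixed seed keeps the example repeatable.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// Sample returns up to k peers chosen at random from peers, without
// modifying the input slice.
func Sample(rng *rand.Rand, peers []string, k int) []string {
	if k < 0 {
		k = 0
	}
	out := make([]string, len(peers))
	copy(out, peers)
	rng.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	if k < len(out) {
		out = out[:k]
	}
	return out
}

func main() {
	peers := []string{"peer-a", "peer-b", "peer-c", "peer-d"}
	fmt.Println(Sample(newRNG(1), peers, 2))
}
```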

vyzo commented 5 years ago

We might also want to have a push protocol for advertisements instead of relying on poll lookup.

jbenet commented 5 years ago

PEX

great to see this here! 👍 -- we've needed something like PEX in libp2p for a long time

> ad := pex.NewAd("gossipsub:topic_name")

oh cool, i didn't recall PEX kept specific topics/swarms associated with each peer. makes sense. We probably want to do something like tags actually:

Get("gossipsub:topic_name") # get all peers related to this gossipsub topic
Get("providers:<selector>") # get all peers related to this ipld selector
Get("transport:QUIC") # get all peers that have QUIC
Get("kad-dht") # get all peers that speak kad-dht
Get("filecoin") # get all peers that speak filecoin
Get("filecoin:retrieval") # get all peers that speak filecoin:retrieval
Get("kad-dht", "gossipsub:topic_name") # get all peers related to this gossipsub topic, and who speak kad-dht

In this sense, maybe we should be doing pathing (/ separated), and re-using the protocol identifiers we already use (for uniqueness and default simplicity):

Get(Path(gossipsub.ProtocolID, "topic_name"))
Get(Path(providers.ProtocolID, selector))
Get(Path("transport", quic.ProtocolID))
Get(Path(filecoin.ProtocolID, filecoin.RetrievalProtocolID))
Get(Path(gossipsub.ProtocolID, "topic_name"), Path("kad-dht"))
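To make the tag idea concrete, here is a small sketch of `Path` and a multi-tag `Get` over an in-memory index. The `Index` type is hypothetical, the tag strings are placeholders rather than real protocol IDs, and `Get` is assumed to return only peers that carry every requested tag.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Path joins segments with "/", as suggested for building tags.
func Path(segments ...string) string {
	return strings.Join(segments, "/")
}

// Index maps each tag to the set of peers advertising it (hypothetical type).
type Index struct {
	peersByTag map[string]map[string]struct{}
}

func NewIndex() *Index {
	return &Index{peersByTag: make(map[string]map[string]struct{})}
}

// Add records that peer advertises every tag in tags.
func (ix *Index) Add(peer string, tags ...string) {
	for _, tag := range tags {
		if ix.peersByTag[tag] == nil {
			ix.peersByTag[tag] = make(map[string]struct{})
		}
		ix.peersByTag[tag][peer] = struct{}{}
	}
}

// Get returns the peers that carry all of the given tags, sorted.
func (ix *Index) Get(tags ...string) []string {
	if len(tags) == 0 {
		return nil
	}
	var peers []string
	for peer := range ix.peersByTag[tags[0]] {
		matches := true
		for _, tag := range tags[1:] {
			if _, ok := ix.peersByTag[tag][peer]; !ok {
				matches = false
				break
			}
		}
		if matches {
			peers = append(peers, peer)
		}
	}
	sort.Strings(peers)
	return peers
}

func main() {
	ix := NewIndex()
	topic := Path("gossipsub", "topic_name") // placeholder, not a real protocol ID
	ix.Add("peer-a", topic, "kad-dht")
	ix.Add("peer-b", topic)

	fmt.Println(ix.Get(topic))
	fmt.Println(ix.Get("kad-dht", topic))
}
```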

> In its basic form, the lookup operation extends our view of the network by degree 2 (we reach peers of our peers), but it can be further enhanced by a TTL mechanism that allows the request to be relayed N number of hops. Thus, if a peer knows of zero peers advertising the CID, it could relay the request to its neighbours.

not sure we should even reach peers-of-peers, but maybe.

> I propose we don't venture into relayed lookup requests at this stage, as it requires thoughtful modelling of rate-limiting, quotas, and scoring, to prevent DDoS attacks. But it's definitely something to keep on the radar.

Yeah i think this needs to be explicitly out of scope for this protocol. this should be a very simple 1-1 protocol (or just about).

> Just like with DHTs, it's hard to guarantee reader privacy. PEX could be used to map out how peers interested in a certain subject are effectively connected. We can introduce randomness to deter such attempts.

yes 👍

Security Considerations

thomaseizinger commented 3 years ago

I think the rendezvous protocol might allow us to implement this. Advertisements are essentially registrations. Namespaces are free-form text so clients can store all kinds of stuff in there. As long as the format is agreed upon, it can be used to advertise gossipsub topics, CIDs, etc.

Menduist commented 2 years ago

> I think the rendezvous protocol might allow us to implement this.

I second this, what's missing from rendezvous to be considered a Peer Exchange protocol? We could have a mode where it's stored in a db, and another where it feeds directly from the PeerStore / other structure in memory

We just need an "auto register", which will register us every time we connect to a peer with RDV enabled

thomaseizinger commented 2 years ago

> I think the rendezvous protocol might allow us to implement this.

What is missing is some kind of standardisation of how the registrations are structured, i.e. what the namespace is.

For example, how do you take a gossipsub topic and advertise it via rendezvous?

It should probably be prefixed with the protocol and then some protocol specific parameters, e.g.:

/gossipsub/1.1.0/topic/my_room

AFAIK, protocol IDs are completely opaque so we can't rely on / or any other char being a separator.
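Since protocol IDs are opaque and may themselves contain `/`, one option is to escape each segment before joining, so the separator stays unambiguous. The sketch below assumes percent-escaping via Go's `net/url`; `Namespace` and `ParseNamespace` are hypothetical helpers, and the resulting string deliberately differs from the unescaped `/gossipsub/1.1.0/topic/my_room` form shown above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Namespace escapes each segment and joins them with "/", so a "/" inside an
// opaque protocol ID cannot be mistaken for a separator.
func Namespace(segments ...string) string {
	escaped := make([]string, len(segments))
	for i, s := range segments {
		escaped[i] = url.PathEscape(s)
	}
	return strings.Join(escaped, "/")
}

// ParseNamespace reverses Namespace and returns the original segments.
func ParseNamespace(ns string) ([]string, error) {
	parts := strings.Split(ns, "/")
	for i, p := range parts {
		s, err := url.PathUnescape(p)
		if err != nil {
			return nil, err
		}
		parts[i] = s
	}
	return parts, nil
}

func main() {
	ns := Namespace("/gossipsub/1.1.0", "topic", "my_room")
	fmt.Println(ns)

	parts, err := ParseNamespace(ns)
	fmt.Println(parts, err)
}
```

The trade-off is readability: the namespace is no longer a plain path, but the protocol ID round-trips intact regardless of which characters it contains.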

> I second this, what's missing from rendezvous to be considered a Peer Exchange protocol? We could have a mode where it's stored in a db, and another where it feeds directly from the PeerStore / other structure in memory

That is IMO entirely an implementation consideration and does not need to be part of a spec / protocol.

> We just need an "auto register", which will register us every time we connect to a peer with RDV enabled

That is also an implementation choice IMO.

thomaseizinger commented 1 year ago

Trying to reboot this in a simpler form: https://github.com/libp2p/specs/discussions/587