Proposal: a radically simpler Peerstore

marten-seemann commented 1 year ago

Observation: The current Peerstore does a lot of things, but it doesn’t any of them well.

Problems:

It’s a giant interface (32 methods in total (!!!), plus 2 for the CertifiedAddressBook).
Entries are GC’ed when the peer disconnects.
It’s trying to offer applications a range of options to store peer-related data, without offering options to persist this data beyond the time a peer disconnects.
It doesn’t keep track of any metrics (e.g. regarding peer addresses stored).
SupportsProtocol doesn’t distinguish between Identify having completed and not, leading to complexity in applications (example).

For example, this shows up:

In QUIC / TLS: it can’t be used to store session tickets for session resumption / 0-RTT.
In Kademlia: it can’t be used to store peer / provider records.

This is a proposal to radically shrink the peerstore, such that it offers only what is needed to dial connections. Applications running on top of QUIC, and even transports like QUIC that want to use 0-RTT, will need to come up with their own data structures to store peer-related data. This is inevitable, since different applications will have different needs regarding persisting data when a peer disconnects, and different heuristics for limiting the size of these data structures.

In particular, we’ll get rid of the following components:

PeerMetadata: Only used to save the agent / protocol version sent in Identify. There’s no real reason to save that information in the first place.
Metrics: Only used to store RTT statistics. Arguably, that’s not a per-peer, but a per-connection (or rather, per path, if dealing with multipath transports) property. We should consider exposing the RTT on the Connection interface.
ProtoBook: Used to store the protocols supported by this peer (as learned via Identify). Takes up huge amounts of storage, for little benefit. Protocols that need a list of peers supporting their protocol can easily subscribe to the respective EventBus notification. [^1]

The KeyBook will be radically simplified to only allow storing of the public key.

As a result, this leaves us with a peer store that only stores:

addresses
public keys
certified peer records (in their serialized form)

It basically becomes an address book, and maybe we should honor that fact by renaming it accordingly.

The following changes will make it a lot more smarter and more useful:

Don’t GC on disconnect. Keep a scoring function of how valuable an entry is, and GC based on that. An entry is valuable, if we’ve connected to a peer multiple times, the node has been online for a long time, or an application tells us that a peer is (based on some application-defined metric, not sure yet how exactly this would work).
Keep track if an address is received securely (e.g. in a signed peer record, on an encrypted connection to that peer, or if we dialed / accepted a connection on that address and completed the handshake). If so, we could ignore all unverified addresses.
Keep stats on how well addresses work in practice, by tracking attempted and successful connection attempts on a per-address basis. This can be fed into the dialing logic to speed up future dials.

We could consider removing the Peerstore interface. This is (mostly) a libp2p-internal component. In particular, it does not make sense to implement a datastore-backed version of it (it might however be interesting to be able to serialize the current state, so it can be restored easily).

[^1]: Note that this breaks an optimization for Host.NewStream, which currently picks the first supported protocol from a list of protocols. This is fine, since 1. worst-case, this results in an extra round-trip, 2. it is a rare use-case and 3. the application can track which protocols a peer speaks and then open a stream with just that protocol, speeding up things by 1 RTT compared to the status quo.

Wondertan commented 1 year ago

The Peerstore(or AddrStore or whatever name this component ends up with) is still very useful for application usage, so please keep it. Furthermore, the datastore-backed implementation is helpful for networks wishing to bootstrap from non-bootstrapper peers, so please keep it as well. Such applications would store peers in the component provided by go-libp2p instead of reinventing the wheel. If libp2p does not provide any means to persist addresses and relies only on the centralized set of bootstrappers, is it really p2p?

These two become even more helpful together with application-defined metrics. Currently, only GossipSub does extensive scoring in the libp2p world, but GossipSub's peer quality metric is also meaningful for other protocols. Usually, if GossipSub thinks a peer is behaving poorly, the chance it behaves poorly in other protocols is high. Potentially, multiple protocols would talk to some sort of libp2p provide PeerScorer/Tracker to maintain a subjective image of the peer.

marten-seemann commented 1 year ago

Thanks for your feedback @Wondertan.

The Peerstore(or AddrStore or whatever name this component ends up with) is still very useful for application usage, so please keep it.

Unless the application is only interested in storing data for the lifetime of the underlying connection, it's not really useful, is it? And even that use case can trivially be replicated with a map and ~10 lines of code for handling the event bus notifications. For most applications, a more sophisticated cleanup logic will be required.

Fundamentally, this is a question about the correct layering. Why would libp2p expose a general interface for peer-related information? Just because we define the peer.ID type? We currently try to do that, and fail pretty badly at it.

If libp2p does not provide any means to persist addresses and relies only on the centralized set of bootstrappers, is it really p2p?

libp2p doesn't rely on bootstrappers, Kubo does (did). With the v0.21 release, they're persisting peers and not relying on the bootstrappers for subsequent starts, without using the database-backed peerstore. It really doesn't make a lot of sense to store information that you regularly need in a database, just so you can persist that information on shutdown. That data should live in memory, and be persisted on disk on shutdown (or in regular intervals). The new peerstore could expose a method to export a snapshot of its current state

These two become even more helpful together with application-defined metrics. Currently, only GossipSub does extensive scoring in the libp2p world, but GossipSub's peer quality metric is also meaningful for other protocols. Usually, if GossipSub thinks a peer is behaving poorly, the chance it behaves poorly in other protocols is high. Potentially, multiple protocols would talk to some sort of libp2p provide PeerScorer/Tracker to maintain a subjective image of the peer.

That's not possible with today's peerstore either. I agree that this could potentially be useful, but will require some careful abstractions. Not every implementation will / should be familiar with the intricacies of GossipSub's scoring function. I doubt that the right approach is to extend the existing Peerstore interface to 40 methods to solve this use case.

MarcoPolo commented 1 year ago

Thanks for writing this up. I agree with all points.

I'm thinking about how to ease the transition for users who are relying on this. Maybe we could expose a new package that lets user protocols still use the existing Peerstore interface (peerstore.New(host.Host) -> peerstore). Users will have to pass this into protocols rather than using host.PeerStore().

sukunrt commented 1 year ago

I agree with all the points.

How should we handle removing PeerMetadata? AgentVersion and ProtocolVersion are part of the identify specs so users might want access to these two fields. I have used them a few times for debugging, though they weren't really helpful. Also used here https://github.com/libp2p/go-libp2p/issues/2205 We can add this info to the event EvtPeerIdentificationCompleted.

sukunrt commented 1 year ago

@Wondertan what application specific usages do you have in mind other than AddrBook, SupportsProtocol and having persistent storage?

I find the current Peerstore API very confusing. One reason that applications cannot use it correctly is that for all TTLs except permanentTTL, we'll remove peers 30 mins after disconnect. This leads to hacks like this https://github.com/libp2p/go-libp2p-kad-dht/blob/master/dht.go#L549

marten-seemann commented 1 year ago

3. ProtoBook: Used to store the protocols supported by this peer (as learned via Identify). Takes up huge amounts of storage, for little benefit. Protocols that need a list of peers supporting their protocol can easily subscribe to the respective EventBus notification.

Discussed this with @Stebalien. It's probably useful to have access to the protocols supported on a connection. It makes sense to store this information with the connection, not in the peerstore. That would allow us to (explicitly) support a different set of protocols on relayed connections.

l0k18 commented 1 year ago

I just want to add a point relating to the "incomplete" persistence for this, and the way the in-memory store being written to already from the base libp2p.Host:

The store keeps a key under the key suffix /priv which appears to be the raw private key in cleartext.

It's not good key storage policy, even in the memory store, if one's code is already storing the key material in a more secure way, and unaware libp2p.Host is doing this, making all that meaningless, this is memory accessible to any and all other processes owned by the same user.

It also appears to be encoded in protobuf, for an in-memory record, that is decoded everywhere it is used, anyway. Why not use Gob for this? For in memory, there's no migration issue, and for on disk, if it's implemented with Badger, then all clients have Gob available, and are even using it for some of the records.

For now, I'm just working directly with the badger back end, using its built-in encryption capability, and using its iterators to walk the records.

I have already forked the pstoreds package. It seems quite mad to leave it so nearly complete when a very complex edifice of interfaces has been made, that it is the sole implementation for.

I will keep following these related issues, as of course if applicable I would be happy to have my code be used or adapted to finish this task properly.

l0k18 commented 1 year ago

Just wanted to illustrate the point about how this code is incomplete and needs work, this is what I see right at the top of the kv store iteration:

pkg/engine/peerstore_test.go:140              16:39:32  trace gnpmhn3x item /peers/keys/AASQQAQSEEBI6WPMTP7EHG3VE4Q67HRZS6ZS2KMLH24RRMPFKMDJ2IF6GRWLNQQ/priv
([]uint8) (len=36 cap=48) {
 00000000  08 02 12 20 7a 9c 39 9c  6e 8a 1e e5 bb 90 a3 f5  |... z.9.n.......|
 00000010  91 4e ec 06 17 32 f9 3e  58 dd a0 38 01 76 08 7b  |.N...2.>X..8.v.{|
 00000020  45 c6 77 9e                                       |E.w.|

That is the private key of the libp2p host, which if used with the current pstoreds badger back end means writing that key in cleartext in the data store, which means it's able to be ripped from 3 different places, on the disk in the persistent store, in memory in the badger/pstoremem instance, and in the libp2p.Host.

marten-seemann commented 1 year ago

@l0k18 Did you intend to post your private key publicly here? Now it can be ripped from 4 different places 🙃

MarcoPolo commented 1 year ago

It's not good key storage policy, even in the memory store, if one's code is already storing the key material in a more secure way,

I think storing secrets in memory is fine. Many other applications do this (SSH, LUKS, ZFS). I agree it's even better if you can offload the secret to a TPM.

making all that meaningless, this is memory accessible to any and all other processes owned by the same user.

This isn't true. At least on Linux, processes can't read other processes memory unless they opt into it as an IPC mechanism.

I don't think it's obvious that using the pstoreds will result in your host key being written to disk. Maybe another reason to remove this from the public api.

Eric-Warehime commented 1 year ago

it might however be interesting to be able to serialize the current state, so it can be restored easily

This is the main use case which was driving us towards using a db-backed peerstore. Without something writing to disk we need to bootstrap every time a node restarts.

Db-backed would also allow us to store arbitrary data for a peer in the metadata--e.g. if we wanted to serialize gossipsub score for persistence across reboots as well.

Is the future approach intended to move all persistence into each individual application, or just focus it in the discovery mechanism?

marten-seemann commented 1 year ago

This is the main use case which was driving us towards using a db-backed peerstore. Without something writing to disk we need to bootstrap every time a node restarts.

Why's that? There's plenty of hooks to learn about connecting / disconnecting peers. It would be easy (and way more performant!) to hook into the eventbus and periodically save some of the connected peers to disk.

Is the future approach intended to move all persistence into each individual application, or just focus it in the discovery mechanism?

Indeed it is. Storage considerations will vary greatly between applications, and it's not realistic to expect that libp2p will be able to provide a one-size-fits-all datastore.

Eric-Warehime commented 1 year ago

Why's that? There's plenty of hooks to learn about connecting / disconnecting peers. It would be easy (and way more performant!) to hook into the eventbus and periodically save some of the connected peers to disk.

I think at the moment it's because it works out of the box. We also haven't gone through perf testing yet to see what kinds of io patterns the disk-backed peerstore creates. Ideally in a network with low churn you wouldn't be updating the peerstore that often and so wouldn't have that much io.

l0k18 commented 1 year ago

@l0k18 Did you intend to post your private key publicly here? Now it can be ripped from 4 different places 🙃

I always write my cryptography tests to use freshly generated random values. I have a scheme that generates them fast using scalar addition also, but I haven't put it into the tests.

I think at the moment it's because it works out of the box. We also haven't gone through perf testing yet to see what kinds of io patterns the disk-backed peerstore creates. Ideally in a network with low churn you wouldn't be updating the peerstore that often and so wouldn't have that much io.

Badger is not much different in terms of IO usage compared to LevelDB or RocksDB but its split value/key tables mean a lot less writing when only keys are accessed, and makes it useful to put some small parts of the values in the keys.

If the key table would be small, or the app is better with lower key search latency it can be memory mapped and iterators stay purely in memory. For on disk, of course after a lot of edits the GC needs to be done to keep the iterations fast.

I can't see any reason why migration needs to be supported though. Non-Golang KV stores are lagging behind what Badger can do. There is another Go-only KV store called 'pogreb' which is really good for data that is mostly append only also.

MarcoPolo commented 7 months ago

We used to be saving signed peer records in the certified address book, but removed that code because it didn't filter the addresses properly, and the certified address book was not used anywhere else in go-libp2p.

The issue of course, is that another protocol was relying on identify's (undocumented?) behavior here.

Having the existing peerstore be a core component of the host leads to these tightly coupled behaviors. I'm not sure identify needs to be so tightly coupled with the peer store. Another solution would be that Identify has a contract of events it emits and events it subscribes to. During the course of an identify it emits events about what it has learned and other components/services/protocols can subscribe to those events. Users could have a custom PeerStore better suited for their application that subscribes to those events. Or user protocols can consume those events directly.

Stebalien commented 7 months ago

The protocol listens to be stored somewhere even if it's not stored in a peer store. E.g.

Services, when first starting up, often want to find all peers that speak a specific protocol.
Multistream definitely needs to know a peer's protocols so it can select the correct protocol without taking a round trip to negotiate anything.

That's why I suggested associating protocols with connections.

MarcoPolo commented 3 months ago

To jot down a couple of thoughts I've had in the meantime:

It makes sense to expose supported protocols.
It seems valuable for protocols to have a way to store some peer data while the peer is connected, and have it properly GC'd when the peer is disconnected easily. This isn't just about being nice to have, but about protecting against a common DoS vector. Getting this right in our current API is surprisingly tricky.

For 2, here's an API I'd like to propose:

// PeerDataStore is a store for peer-specific data. Data will be cleared some
// time after a peer disconnects.
//
// This is intended for protocols to store peer related data that they want
// cleared after disconnect. It is not intended for different protocols to share
// state. Protocols wanting to do so should send messages on the event bus.
// "Do not communicate by sharing memory; instead, share memory by communicating."
type PeerDataStore interface {
    // PutMetadata stores a key-value pair for a peer.  The key must be unique
    // for the peer.  The key should follow the same rules as keys for
    // `context.WithValue`.  The value will be stored until ttlAfterDisconnect
    // has passed since the peer was disconnected.
    //
    // If you only need to store things while the peer is connected, pass 0 for
    // ttlAfterDisconnect. It will be cleared when the peer disconnects.
    PutMetadata(p peer.ID, key any, val any, ttlAfterDisconnect time.Duration) error
    // GetMetadata returns the value for a key for a peer.
    GetMetadata(p peer.ID, key any) (any, error)
}

There are just two methods. Protocols can do what they want with it, and we will ensure that the data gets GC'd appropriately. This is the interface that protocols (users) interact with. The actual implementation itself has more methods obviously.

Generally protocols won't share their data with other protocols, but for something like the KeyBook that Marten mentioned above we could maybe wrap the PeerDataStore and expose a more specific interface. I could also imagine a couple other exceptions where protocols will want to provide "Read Access" to their peer state (e.g. a cross-protocol peer score). In those cases they too would wrap this and provide a more specific interface.

Stebalien commented 3 months ago

I think we should start by listing what we want to store. For example, storing supported protocols in this manner is going to be very trick to do without race conditions.

MarcoPolo commented 2 months ago

I think my phrasing in my last post was not great. "Protocol" is a very overloaded term, but here I mean it as some piece of code that consumes a PeerDataStore interface, and optionally returns some other interface other users can use. Component is probably a better term here.

For example, storing supported protocols in this manner is going to be very trick to do without race conditions.

There's no race condition because there's a single owner of the data. "Generally protocols won't share their data with other protocols, but ... [they could] expose a more specific interface". To answer the example, there could be a component that owns tracking supported protocols. It's the only reader/writer of the underlying PeerDataStore. Concurrent requests to this component can be serialized via a lock (or equivalent). A component like this could also provide a more specialized interface like SupportedProtos(conn network.Conn) []protocol.ID to return protocols supported for that specific connection as well as the SupportedProtos(id peer.ID) []protocol.ID.

burdiyan commented 1 month ago

I find the current Peerstore API very confusing. One reason that applications cannot use it correctly is that for all TTLs except permanentTTL, we'll remove peers 30 mins after disconnect. This leads to hacks like this https://github.com/libp2p/go-libp2p-kad-dht/blob/master/dht.go#L549

Agree with @sukunrt on this. It's hard to rely on the Peerstore for anything, because of the aggressive GC and TTLs. Trying to keep data in sync between the Peerstore and whatever peer-related data is tracked in the application is also often quite confusing though.

I guess one reason for this is because it's unclear what guarantees Peerstore offers, and what is the right way to use it. Implementing this proposal would help, at least in making it clear that applications should not rely on the Peerstore, and should keep track of the data they might need themselves.

Keeping the interface slim would help applications to implement wrappers around the built-in implementation, just in case they need to. E.g. I could think of a scenario where the application tracks "friend" peers and their latest addresses in some long-term storage, so they might want to wrap the default address-book, and read addresses from the persistent storage when there're no addresses in memory.

libp2p / go-libp2p

Proposal: a radically simpler Peerstore #2355