Providing fast NS lookups: evaluating provider records, querying strategies

The underlying DHT (which follows the libp2p Kademlia DHT spec) lets peers query by a TOPIC for a RESULT, or query by a TOPIC for all PROVIDERS that have the RESULT. In the name system, we use a sphere's identity (DID) as the TOPIC and expect a UCAN auth token (or CID, TBD) as the RESULT, which can be verified and resolved into the sphere's latest revision (CID). DHT nodes handle these queries by sharing key/value records between peers. These records can be interpreted in two ways:

Value Records: The value is the RESULT. Using Value Records is a push model, where the RESULT is propagated to peers.
Provider Records: The value is a PROVIDER (specifically a PeerId and Multiaddr) that provides the RESULT. Using Provider Records is a pull model, where a node announces it has the RESULT for a TOPIC by propagating that record to peers. Nodes querying in this model ask for the list of providers of the record (discovery), and can subsequently dial those providers. The process of retrieving the RESULT from a PROVIDER is outside the scope of the DHT's spec, though the rust-libp2p implementation allows us to hook into this via e.g. Bitswap.
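To make the distinction concrete, here is a minimal sketch of the two interpretations using only the standard library. It is not the rust-libp2p API: `Record`, `Provider`, `NextStep`, and `lookup` are hypothetical names, and peer ids, addresses, and results are plain strings for illustration.

```rust
use std::collections::HashMap;

/// TOPIC: a sphere identity (DID), kept as a plain string in this sketch.
type Topic = String;

/// A PROVIDER: a peer id plus an address to dial (both simplified to strings).
#[derive(Debug, Clone, PartialEq)]
struct Provider {
    peer_id: String,
    multiaddr: String,
}

/// The two ways a DHT record's value can be interpreted.
#[derive(Debug, Clone, PartialEq)]
enum Record {
    /// Push model: the value is the RESULT itself.
    Value(String),
    /// Pull model: the value names peers that can serve the RESULT.
    Providers(Vec<Provider>),
}

/// What the querying node has to do after the DHT lookup returns.
#[derive(Debug, PartialEq)]
enum NextStep {
    /// The RESULT is already in hand.
    Done(String),
    /// The RESULT must still be fetched from one of these peers,
    /// which happens outside the DHT (e.g. via Bitswap).
    Dial(Vec<Provider>),
    NotFound,
}

fn lookup(store: &HashMap<Topic, Record>, topic: &str) -> NextStep {
    match store.get(topic) {
        Some(Record::Value(result)) => NextStep::Done(result.clone()),
        Some(Record::Providers(peers)) => NextStep::Dial(peers.clone()),
        None => NextStep::NotFound,
    }
}
```

The point of the sketch is the extra hop: a Value Record lookup ends at the DHT, while a Provider Record lookup only ends discovery.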

Currently, the name system uses Value Records for simplicity while we shape out how the system works. The DHT RESULT is a UCAN token verifying a sphere's key delegate with "sphere/publish" capabilities, along with the final value the name system is responsible for: the sphere's revision address. When a new revision is published, the gateway/NS hosts the new record and sends it to its peers, then re-sends it at intervals, possibly to different peers. In the v1 name system, when querying the network for results, we take the first valid record we can find (which, in small networks, may already be on the node), even though there may be many valid/verifiable records for a single identity within the TTL. We may want lookup options that choose between taking the first record found (fast!) and waiting for 3 or more and taking the latest. In either case, there is no guarantee that a newer, not yet discovered revision does not exist.
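The lookup options described above could be sketched roughly as follows. This is an illustration, not the current implementation: `Strategy`, `NameRecord`, and `resolve` are hypothetical names, records are assumed to be already verified, and `issued_at` is an assumed ordering field (revision CIDs alone do not say which record is newer).

```rust
/// A verified name record, reduced to what the lookup needs.
#[derive(Debug, Clone, PartialEq)]
struct NameRecord {
    /// The sphere's revision address (a CID, kept as a string here).
    revision: String,
    /// Hypothetical timestamp used to order records for the same identity.
    issued_at: u64,
}

#[derive(Debug, Clone, Copy)]
enum Strategy {
    /// Return the first valid record found (fast).
    First,
    /// Collect up to this many records, then take the latest.
    LatestOf(usize),
}

/// `records` stands in for records arriving from peers, in arrival order.
/// The iterator is consumed lazily, so the query stops once enough arrive.
fn resolve<I>(records: I, strategy: Strategy) -> Option<NameRecord>
where
    I: IntoIterator<Item = NameRecord>,
{
    let wanted = match strategy {
        Strategy::First => 1,
        Strategy::LatestOf(n) => n.max(1),
    };
    records
        .into_iter()
        .take(wanted)
        .max_by_key(|record| record.issued_at)
}
```

If fewer than the requested number of records arrive, this sketch returns the latest of whatever was found. Neither strategy can rule out a newer revision that no queried peer has seen yet.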

Using Provider Records instead could mean a single PROVIDER for a TOPIC, since pragmatically only a sphere's corresponding gateway would be propagating records. As the actual RESULT is pulled on demand, the gateway with fresh records will always be the peer that is found, i.e. results are fresh whenever we query outside of the local cache. This does of course require the gateway to be always available, and it is still possible for other nodes to announce themselves as providers for the same records a gateway is hosting.

The gateway's key could hold the "sphere/publish" capabilities used to sign the UCAN auth token, but requiring permission to be a PROVIDER lowers resilience a bit. An alternative is for each gateway to sign its own records (only to bind the token to the gateway, not for auth), wrapping the underlying sphere-blessed publish token. In the scenario of a non-updated sphere, this would look like multiple providers with unique tokens (each signed by its gateway) containing the same underlying sphere-signed token with an identical sphere revision address. Where the underlying sphere revision addresses differ, the original sphere's authority could specify the gateway's key, not to restrict permission to publish, but as a tie-breaker in record resolution, opting for the token from the "definitive source".
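A rough sketch of that tie-breaker, under loud assumptions: `SphereToken`, `GatewayRecord`, `definitive_gateway`, and `pick` are hypothetical names, keys are plain strings, and verification of both the gateway signature and the sphere signature is assumed to have happened before this step.

```rust
/// The sphere-signed publish token, reduced to the fields used here.
#[derive(Debug, Clone, PartialEq)]
struct SphereToken {
    revision: String,
    /// Hypothetical field: the gateway key the sphere names as its
    /// "definitive source". It does not restrict who may publish.
    definitive_gateway: Option<String>,
}

/// A record as announced by one gateway: the sphere token wrapped
/// and bound to the announcing gateway's key.
#[derive(Debug, Clone, PartialEq)]
struct GatewayRecord {
    gateway_key: String,
    inner: SphereToken,
}

/// Choose one record among already-verified records for the same sphere.
fn pick(records: &[GatewayRecord]) -> Option<&GatewayRecord> {
    let first = records.first()?;
    // Non-updated sphere: every gateway wraps the same revision,
    // so any of the records will do.
    if records
        .iter()
        .all(|r| r.inner.revision == first.inner.revision)
    {
        return Some(first);
    }
    // Revisions differ: prefer the record announced by the gateway
    // that its own sphere token names as the definitive source.
    records
        .iter()
        .find(|r| r.inner.definitive_gateway.as_deref() == Some(r.gateway_key.as_str()))
        .or(Some(first))
}
```

When no record comes from the named gateway, this sketch falls back to the first record rather than failing. That choice is arbitrary and could just as well be "latest of the rest".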

This is getting long, but after thinking through the last part: we could add some facts to the UCAN token containing the gateway's key, which should be resolvable into a PeerId, i.e. directly addressable via the network. Even with the current Value Record implementation, adding the gateway's key to records would let us keep fetching records from peers as we do now and, when we want to confidently get the latest revision, take the gateway key from a possibly stale record and directly dial the gateway for the latest (then store it, strengthening the network).
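That flow might look something like the sketch below. `CachedRecord`, `gateway_peer`, and `resolve_latest` are hypothetical names, and `dial_gateway` is a stand-in closure for whatever direct request the network layer would provide; nothing here is an existing API.

```rust
/// A record fetched from peers via the current Value Record lookup.
#[derive(Debug, Clone, PartialEq)]
struct CachedRecord {
    revision: String,
    /// Hypothetical fact carried in the token: the gateway's key,
    /// resolved to something dialable (a PeerId, kept as a string here).
    gateway_peer: Option<String>,
}

/// Start from a possibly stale record and, if it names a gateway,
/// ask that gateway directly for its latest record.
fn resolve_latest<F>(from_peers: Option<CachedRecord>, dial_gateway: F) -> Option<CachedRecord>
where
    F: Fn(&str) -> Option<CachedRecord>,
{
    let stale = from_peers?;
    let fresh = stale
        .gateway_peer
        .as_deref()
        .and_then(|peer| dial_gateway(peer));
    // If the gateway is unreachable or unnamed, the record from peers
    // is still the best answer available.
    Some(fresh.unwrap_or(stale))
}
```

A caller would then store the returned record locally, so that later queries from other peers can be answered with the fresher value.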
