Discussion: Identifying improvements for resolution systems

miyazono commented 5 years ago

From a discussion that started with IPNS name resolving @aschmahmann

go-IPFS wants IPNS to have <1 second resolution, but decentralized peerID resolution can end up taking more than 1 second. They are asking for a pseudo-centralized solution in the meanwhile. They are thinking of using something like DNS, but would it be sufficient to use rendezvous? What do you think the delivery timing is for rendezvous?

@miyazono

I’ve been kinda kicking around the idea of other, more complicated graph topologies than that of Chord in my mind and what that would look like, but that was in the context of trying overhauling the DHT for provider records, and I’m starting to wonder how connected peerID resolution is.

@aschmahmann

My thought is that PeerID resolution is halfway between a naming system resolution (e.g. IPNS) and provider records (e.g. IPFS). All three of these systems amount to advertising "You can find X at Y". IPFS provider records say "you can probably find the data with hash=Hash(data) at this address" which, even though we don't ask for confirmation they have and can provide the data (feels similar to Filecoin retrieval market issues), we know we'll get the right data when we see it. IPNS records say "You can find the data corresponding to Hash(Key) at Hash(data)", which while we can confirm with signatures we have trouble guaranteeing freshness. PeerID resolution, like IPNS, is mutable data resolution and requires freshness guarantees. However, because we can dial the peer and check that they have the correct PeerID it has the verifiable quality of IPFS provider records.

Another example of PeerIDs being pretty closely related to provider records is DHT PubSub routing. My understanding is that I advertise "I care about channel C" by putting a provider record for Hash(C) into the DHT. This effectively turns a (channel) ID into network addresses as PeerIDs do.

miyazono commented 5 years ago

Thanks @aschmahmann. The similarities are definitely pretty clear, and having the three systems described like that is incredibly helpful.

From a brainstorming standpoint, it seems like we're only constrained by the fact that we don't influence where the records are. Using a Kademlia-based DHT for provider records, we've picked some point on the time/space tradeoff where nodes each store less of the database resulting in more hops to resolve the request. I'm interested to learn how we do IPNS and PeerID resolution (do we use a similar DHT?) but I'm happy to read wherever that's documented.

The IPNS freshness issues seems interesting, and I'd assume there's lots of thought and active research on that, since the problem seems pretty easily generalizable.

I'm assuming that common paths for solving resolution time/space tradeoff problems are

unequal distribution of records, with supernodes that hold more records or connections
influencing/dictating which node stores which records. I'd bet there are more in the literature. I think it'd be incredibly interesting to summarize everywhere this problem shows up for us, how we deal with it, and see if we could get collaboration on it.

aschmahmann commented 5 years ago

@miyazono we currently use the same DHT for everything and use namespaces (/ipns, /pk, /ipfs) to differentiate the request types. However, there's some thought that maybe we should be breaking the DHT up based on request type or other protocol parameters. There's some thoughts about these DHT related questions at libp2p/notes#10, ipfs/notes#291, libp2p/research-dht#6 (and likely some other places I'm forgetting right now).

You're absolutely right that we're trading off spreading the data around the DHT for additional hops. My understanding is that two big issues with our Kademlia DHT are:

Kademlia fundamentally relies on the triangle inequality and that because the XOR metric (XOR(A,B) interpreted as an integer) satisfies the triangle inequality we can use the XOR metric between peerIDs to route our way towards our data.

Unfortunately, because of asymmetric connections (NATs, firewalls, etc.) the triangle inequality doesn't actually hold on our networks. For example, say Alice, Bob, Charlie, and Dan have peerIDs (Alice, 0000), (Bob, 0011), (Charlie, 0111), (Dan, 1111). Also say that Alice, Bob and Charlie are all connected to each other, but Dan is only connected to Bob. If Alice tries to find data stored at 1111 she will ask Charlie since he is the closer than Bob. However, because Charlie isn't connected to Dan Alice won't find him.

As a result, all users on asymmetric connections are effectively adversarial users. Hopefully this should be helped by properly implementing S/Kademlia which will help us better deal with adversarial nodes (see libp2p/go-libp2p-kad-dht#146 and libp2p/go-libp2p-kad-dht#204). However, if instead of a fraction of nodes being adversarial we have many adversarial nodes on the network (e.g. NAT = adversary) then we're going to need to rely on making those nodes no longer problematic. This could include leveraging the newly implemented autorelay system, adding systems to separate out local peers from global peers, etc.
The DHT is global and used for various sorts of data and protocols. Even for a single data type there's some balance between the number of nodes in the network, amount of times data is place in the network, the size of routing tables on individual nodes and lookup time. There are tradeoffs to be made here, we can't win on every front (e.g. 1B nodes, data is placed in the network twice, 1KB routing table, 1 round trip to find data). The DHT issues I linked above have some people's thoughts on the matter.

bitcard commented 3 years ago

mark

libp2p / notes

Discussion: Identifying improvements for resolution systems #17