probe-lab / go-kademlia

Generic Go Kademlia implementation
Other
17 stars 4 forks source link

Optimize Routing Table Module #2

Closed iand closed 11 months ago

iand commented 1 year ago

ETA: 2023-08-31

Description

Currently most of the DHT Routing Table logic is contained in the repository https://github.com/libp2p/go-libp2p-kbucket. However, having this part of the implementation in a repository distinct from https://github.com/libp2p/go-libp2p-kad-dht is impractical, because of dependency reasons. IMO all code directly concerning the DHT implementation should live in a single repository.

The Minimal Working Modular DHT contains a minimal Routing Table example, without all implementation details and features that are currently available in the https://github.com/libp2p/go-libp2p-kbucket repository (e.g diversity filter). The goal of this subproject is to filter the features of https://github.com/libp2p/go-libp2p-kbucket that are actually useful and migrate them to the refactored DHT implementation. In addition, it would be great to have unit tests thoroughly testing the Routing Table implementation, making sure that the behavior is exactly as expected.

References

Children:

guillaumemichel commented 1 year ago

Different routing tables

https://github.com/libp2p/go-libp2p-kbucket is not optimized because it is not using a xor trie (e.g https://github.com/libp2p/go-libp2p-xor, https://github.com/guillaumemichel/py-binary-trie). The basic routing table should make use of a XOR trie.

We may also be willing to create a ClientRT, because the optimized routing table is different for clients for they don't have the same constraints as DHT servers.

We may want to port the FullRT (https://github.com/libp2p/go-libp2p-kad-dht/tree/master/fullrt) mechanism (it should be superseded by Reprovide Sweep).

Another idea that has been discussed is to implement a LazyRT where a node would keep storing peers beyond the bucket limit, but only refresh peers within the bucket size limit. The RT would contain much more peers and doesn't need to refresh the extra peers, however they may become stale.

Each RT implementation may require its own issue (but not all of them are required before production).

Refresh

For now, no refresh mechanism has been implemented for the routing table(s). The refresh mechanism has to be implemented for all production RTs (multiple RTs can use the same refresh mechanism). See https://github.com/libp2p/go-libp2p-kad-dht/tree/master/rtrefresh for reference.

https://github.com/libp2p/go-libp2p-kbucket/blob/master/bucket_prefixmap.go is required for IPFS DHT refresh. Unfortunately :(

IPFS DHT Routing Table

In the IPFS DHT Routing Table, we want to make sure that peers are sent a FIND_NODE RPC, and not simply a ping during the refresh. (see https://github.com/libp2p/go-libp2p-kad-dht/pull/810)

We also want the peers to be sent a FIND_PEER RPC to make sure they are actually DHT Servers (and not misconfigured nodes) and they are able to answer Kademlia requests (see https://github.com/libp2p/go-libp2p-kad-dht/pull/820)

guillaumemichel commented 1 year ago

Note that there are multiple possible refresh mechanisms routing tables:

iand commented 11 months ago

Last item was https://github.com/libp2p/go-libp2p-kad-dht/issues/936