Protocol interface refinement

iand commented 10 months ago

Change the return values of FindNode and Get. We need to return a list of NodeInfo instead of a list of NodeID.

iand commented 10 months ago

Some thoughts about the protocol interfaces:

The RecordProtocol isn't parameterized by the Record. Can I use a RecordProtocol with different record types? What happens if I do that?
There is no way to specify a timeout. Are we going to rely on the context for this?
Ping could be more useful if it returned some metrics like roundtrip latency.
We could improve the semantics of Ping and make it a useful connectivity check. Since it's not currently used in IPFS we could define it so that a successful ping means the node is a dht node that can be used for routing. In other words if Ping succeeds then we can assume that the node will respond to FindNode requests. The implementation of Ping in libp2p-kad-dht may actually do a Dial followed by a FIND_NODE, just like we do with a connectivity check.

iand commented 10 months ago

There is no way to specify how many closer nodes are wanted. Is this going to be a parameter that the implementation of RoutingProtocol has pre-configured? It seems that this should be something that go-kademlia will be managing to allow different types of DHT protocol to be constructed.

iand commented 10 months ago

There is no way to limit the number of records returned by Get. Suppose I just want 1 or the remote node has tens of thousands? Also, is it actually the case that a node will have multiple records associated with a key? Won't there be just 1?

iand commented 10 months ago

Put should specify the key for the record

iand commented 10 months ago

An alternate proposal that would allow extensible configuration:

// RoutingProtocol defines the methods necessary for routing in a network.
type RoutingProtocol[K Key[K], N NodeID[K], A Address[A]] interface {
    // FindNode sends a message to a node requesting that it return its closest nodes to the target key.
    FindNode(ctx context.Context, to N, target K, cfg *RequestConfig) ([]NodeInfo[K, A], error)

    // Ping sends a message to a node to perform a liveness check.
    Ping(ctx context.Context, to N, cfg *RequestConfig) error
}

// RecordProtocol defines the methods for storing and retrieving records in a network.
type RecordProtocol[K Key[K], N NodeID[K], A Address[A]] interface {
    // Get sends a message to a node requesting that it return the records associated with the target key.
    // The node may also return a list of closer nodes that may also store records for that key.
    Get(ctx context.Context, to N, target K, cfg *RequestConfig) ([]Record, []NodeInfo[K, A], error)

    // Put sends a message to a node requesting that it store the record.
    Put(ctx context.Context, to N, record Record, cfg *RequestConfig) error
}

// RequestConfig defines common configuration options for protocol requests.
type RequestConfig struct {
    // Deadline sets the time by which the request must be completed.
    Deadline time.Time

    // could be extended in future to set a request trace to measure latency profile
}

Note I used Deadline rather then Timeout. In general deadlines should be preferred, especially when we are scheduling actions that may not be executed immediately. The caller may have its own deadline so if there is a delay starting the request then the timeout won't correspond with the caller's expectations.

guillaumemichel commented 10 months ago

I like the idea of having a single GET (and also PUT) interface to handle all kinds of records (peer, provider, IPNS). See this issue.

go-kademlia only needs the request to contain the target kademlia key (and termination condition, which can be a closure), and the response needs to have a closer_peers field (so that the routing table can be updated). All the rest should be handled by the DHT implementation and has no impact on go-kademlia.

dennis-tra commented 10 months ago

The RecordProtocol isn't parameterized by the Record. Can I use a RecordProtocol with different record types? What happens if I do that?

My thinking was that you'd need to implement the RecordProtocol multiple times for any type of record you're interested in. Because the protocol isn't parameterized by the Record you can call, e.g., a Put with a wrong record type which would result in an error. Is that what you meant by "isn't parameterized by the Record"?

There is no way to specify a timeout. Are we going to rely on the context for this?

I would advocate for using context deadlines and not introducing a new mechanism to configure timeouts etc.

Ping could be more useful if it returned some metrics like roundtrip latency.

👍 though the Ping protocol is controversial in the libp2p case: https://github.com/libp2p/go-libp2p-kad-dht/pull/810#issuecomment-1422856193. In the general go-kademlia context, it makes sense IMO.

We could improve the semantics of Ping and make it a useful connectivity check. Since it's not currently used in IPFS we could define it so that a successful ping means the node is a dht node that can be used for routing. In other words if Ping succeeds then we can assume that the node will respond to FindNode requests. The implementation of Ping in libp2p-kad-dht may actually do a Dial followed by a FIND_NODE, just like we do with a connectivity check.

That's cool and addresses my comment above 👍

There is no way to specify how many closer nodes are wanted. Is this going to be a parameter that the implementation of RoutingProtocol has pre-configured? It seems that this should be something that go-kademlia will be managing to allow different types of DHT protocol to be constructed.

Accoding to the original Kademlia paper, the number of closer nodes is a network wide parameter. I don't think this should be configurable. If go-kademlia -> go-dht then it might make sense - though, I'd need to think about it a bit more.

There is no way to limit the number of records returned by Get. Suppose I just want 1 or the remote node has tens of thousands? Also, is it actually the case that a node will have multiple records associated with a key? Won't there be just 1?

Shouldn't this rather be part of the protocol like a limit field in the message? A node can have many provider records for a key.

Put should specify the key for the record

In go-libp2p-kad-dht the key is on the record. So, alternatively, we could require a Key() kad.Key method on the Record interface? No strong opinion, it'd also be fine to have a key/target parameter on the Put method - I'd just prefer the function signature to have as few parameters as possible.

guillaumemichel commented 10 months ago

IMO this interface should be defined on the query implementation, and NOT generally for go-kademlia. Different query implementations may need different interfaces. In the end, go-libp2p-kad-dht is a user of a query implementation living in go-kademlia, itself relying on other components of go-kademlia such as a routing table, key format etc.

However, when we will implement a more optimized query mechanism, the interface may have different needs.

A concrete example of another query mechanism, is using a low concurrency parameter for the first hops of the lookup, and a larger concurrency parameter for the last hops, optimizing network usage.

Another example is the use of soft timeouts, we could have a query mechanism with a concurrency C parameter, that will have C messages inflight. Each request is configured with a X ms timeout. If a remote peer fails to answer within Y ms (with Y < X), we keep waiting for the answer until X but send another request, going above the concurrency factor of C. This allows to cut the dependency on slower peers, and accelerate the lookup resolution.

With both examples, it seems to me like they could all share the same interface described in this PR. This interface could be the generic one (that all query mechanisms can/must implement). However, some query mechanisms may use a different interface because they require custom parameters, and their users will never use the generic GET and PUT.

guillaumemichel commented 10 months ago

There is no way to specify a timeout. Are we going to rely on the context for this?

As a first query implementation, it is probably easier to define the timeout along with other constants such as concurrency parameter, number of closer peers to lookup in the routing table when starting a lookup, etc.

probe-lab / go-kademlia

Protocol interface refinement #98