probe-lab / go-kademlia

Generic Go Kademlia implementation
Other
17 stars 4 forks source link

Protocol vs Endpoint #89

Open dennis-tra opened 11 months ago

dennis-tra commented 11 months ago

Right now, we're working with a Endpoint interfaces to abstract away the concrete communication between two nodes in the network. Here are the interfaces

// Endpoint defines how Kademlia nodes interacts with each other.
type Endpoint interface {
    // MaybeAddToPeerstore adds the given address to the peerstore if it is
    // valid and if it is not already there.
    MaybeAddToPeerstore(context.Context, kad.NodeID, time.Duration) error
    // SendRequestHandleResponse sends a request to the given peer and handles
    // the response with the given handler.
    SendRequestHandleResponse(context.Context, address.ProtocolID, kad.NodeID,
        message.MinKadMessage, message.MinKadMessage, time.Duration, ResponseHandlerFn)

    // KadKey returns the KadKey of the local node.
    KadKey() key.KadKey
    // NetworkAddress returns the network address of the given peer (if known).
    NetworkAddress(kad.NodeID) (kad.NodeID, error)
}

// ServerEndpoint is a Kademlia endpoint that can handle requests from remote
// peers.
type ServerEndpoint interface {
    Endpoint
    // AddRequestHandler registers a handler for a given protocol ID.
    AddRequestHandler(address.ProtocolID, RequestHandlerFn)
    // RemoveRequestHandler removes a handler for a given protocol ID.
    RemoveRequestHandler(address.ProtocolID)
}

As an alternative approach, I'm proposing a Protocol interface with the following signature:

type Protocol[K Key[K], N NodeID[K], R Record] interface {
    Get(ctx context.Context, to N, target K) (Response[K, N, R], error)
    Put(ctx context.Context, to N, record R) error
}

// Required interfaces from above

type Record interface {
    // ...
}

type Response[K Key[K], N NodeID[K], R Record] interface {
    Records() []R
    CloserNodes() []N
}

The assumption here is that in a Kademlia network any record type (provider, ipns, peer) has the two primitives Get and Put. A Get request to a remote node will always return a list of results (Records() []R) plus a list of nodes that know more (CloserNodes() N).

In the libp2p/IPFS case a concrete implementation of the Protocol interface (only GET) could look like this:


type ProtocolFindNode struct {
    Host host.Host
}

var _ kad.Protocol[key.Key256, PeerID, ma.Multiaddr, peer.AddrInfo] = ProtocolFindNode{}

func (p ProtocolFindNode) Get(ctx context.Context, to PeerID, target key.Key256) (kad.Response[key.Key256, PeerID, ma.Multiaddr, peer.AddrInfo], error) {
    if err := p.Host.Connect(ctx, peer.AddrInfo{ID: to.ID}); err != nil {
        return nil, err
    }

    // start talking the kademlia protocol
    s, err := p.Host.NewStream(ctx, to.ID, "/ipfs/kad/1.0.0")
    if err != nil {
        return nil, err
    }
    defer s.Close()

    // craft request
    req := &Message{
        Type: Message_FIND_NODE,
        Key:  target, // TODO: this is wrong. needs to be the preimage
    }

    // send request
    err = WriteMsg(s, req)
    if err != nil {
        return nil, err
    }

    // wait for response
    resp := &Message{}
    err = ReadMsg(s, resp)
    if err != nil {
        return nil, err
    }

    // wrap response in a type that implements the Response interface. Perhaps Message could implement it directly?
    fnm := FindNodeMessage{msg: resp}

        // add peers to peer store
    for _, cn := range fnm.CloserNodes() {
        for _, a := range cn.Addresses() {
            p.Host.Peerstore().AddAddr(cn.ID().ID, a, /* some ttl */)
        }
    }

    return &fnm, nil
}

func (p ProtocolFindNode) Put(ctx context.Context, to PeerID, record peer.AddrInfo) error {
        // just a put request
}

When I start a query in go-kademlia I need to differentiate between a GET or PUT query and then pass it the specific protocol I want to use. E.g., there should be another Protocol interface implementation for provider records. If I want to get the provider record for a certain key, I'd start a query by enqueuing an eventAddQuery event and pass it the protocol it should use. When go-kademlia starts the query it calls Get on the protocol and uses the result CloserNodes result to guide further queries. The generic Record type allows type safety. A concrete Protocol implementation would return the correct record type to the caller.

A Put operation would use some kind of base-protocol that implements node discovery Get/Put operations to find the closest node to a certain key. Only the last step of storing the record would use the concrete record-specific protocol.

There are a few issues with this approach:

  1. Passing the generic record type through the hierarchical state machine to maintain type safety doesn't work because the type switches that we're using only match on concrete types. So we need to down cast to Record interfaces and then when passing the record out to the user back to R.
  2. Get takes a Kademlia key at the moment. In IPFS land we're querying for the preimage of the Kademlia key, so there must be a way to combine both and pass it into the Get method.

I formulated the above in a hurry and have some proposals for the issues that I'll post here next week.

iand commented 11 months ago

I quite like the Endpoint abstraction as the component that knows how to send messages to nodes. I think it could be refined and brought down to a minimal SendMessage method plus a FindNode method, which I think is such a fundamental operation that it warrants promotion to a method.

The current Message abstraction works quite naturally with queries. To put a record to N nodes you

  1. start a query, passing the put operation as a Message (currently you also need a protocol id but it can be removed)
  2. the query sends the message to the closest node it knows about via the Endpoint. The Endpoint interprets the message (via a type assertion currently but Endpoint could be parameterised by message type) and sends it to the node, thereby putting the record.
  3. the query receives a response including closer nodes which are added to its node list.
  4. go to 2 and repeat until we have put to the N closest nodes.

The query only needs an opaque Message and an Endpoint. It doesn't know what operation it is performing. The operation might not be record-oriented, it could be a ping or hello style operation.

I think to switch to using Protocol the query would need to know the Protocol, which method is being performed Get or Put and the Record if it's a Put method. That complicates the query interface quite a lot.

Possibly that could be solved by passing the query a function that encapsulates the call to Get or Put with the associated record, but then that looks a lot like simply passing a Message implementation(which itself could simply be a function).

Another thought is that, as shown, the Protocol methods are synchronous whereas the Endpoint is asynchronous (via the callback). This means the caller needs to block, probably in a goroutine. This complicates the state machine and scheduler logic which is designed to dispatch a request and deal with the response as a later event.

guillaumemichel commented 10 months ago

I am not in favor of replacing the Endpoint abstraction with a Protocol abstraction. The only purpose of endpoint is to send and receive messages or even just bytes. It is used to connect peers to each other, and shouldn't know anything about protocols or even DHT specifics. Using specific Protocols as described in this PR would lead to code duplication, because if we take IPFS as an example, the FindNode, GetProviders, GetValue, PutProvider and PutValue would need to create a new libp2p stream and write into it. The only difference in sending a FindNode or GetProviders message to a remote peer is the body of the message.

I think it could be refined and brought down to a minimal SendMessage method plus a FindNode method, which I think is such a fundamental operation that it warrants promotion to a method.

I agree that the Endpoint interface should be refined and brought down to the minimal interface. However, I don't think that a FindNode method is required. FindNode should simply be a user of SendMessage. FindNode message format depends on the wire format, which is protocol specific, and the Endpoint should be generic and not protocol specific.

In addition to SendMessage, we may want to define a way to receive messages. We don't need to define a way to handle these messages, because this part is protocol specific, but being able to send a receive bytes (or a message) seems to be the minimum. The ReceiveMessage method should only be required by nodes running in server mode. So we could have 2 interfaces (e.g Endpoint and ServerEndpoint implementing Endpoint and the receive method).

The benefit of a minimal Message interface (in opposition to just bytes) is that it can contain a CloserPeers field that can be read by the endpoint, so that it can keep track of any potential closer peers. Alternatively, if we don't want to have a Message interface and just want to provide bytes to send over the network, Endpoint needs to have an interface to add (and remove) peers, e.g AddPeers, that the users of SendMessage or ReceiveMessage could call.

This last method (AddPeers) could also be Endpoint implementation specific, as long as it is fine for a query to rely on a single Endpoint implementation. Or there could be multiple classes of Endpoint, implementing different features, but this would add complexity (I think that it is currently the case lol).

iand commented 10 months ago

FindNode should simply be a user of SendMessage. FindNode message format depends on the wire format, which is protocol specific, and the Endpoint should be generic and not protocol specific.

Not at all. The endpoint simply translates FindNode into an appropriate message and calls SendMessage. The advantage is the caller doesn't need to know any specifics of the protocol/wire format for finding closer nodes. Currently if we want to perform a connectivity check on a peer as part of a background process we have to inject a way to construct the appropriate find node message. Moving that responsibility to the endpoint instead seems appropriate for such a fundamental operation.

guillaumemichel commented 10 months ago

IMO the message format should be a system parameter, and not part of the Endpoint nor the Query modules. It should be defined by the DHT implementation (e.g go-libp2p-kad-dht).

Otherwise, it means that a different Endpoint implementation is required for each DHT protocol, and generic transport endpoints (e.g TCP, libp2p, etc.) cannot be reused from one protocol to another. go-kademlia modules are expected to be generic, and in this case Endpoint would be protocol dependent.

Endpoint should only care about getting a message/bytes from one node to another (and possibly learning how to reach new peers). If we need any additional capability, we should use another interface (DhtEndpoint?) depending on an Endpoint and on a wire format.

In simplequery, the query takes as argument the key that is being looked up, and the lookup request message. The query simply passes down the lookup request message to the endpoint, that simply marshalls it and write it on the wire. This approach is generic, because the Query and the Endpoint implementation don't depend on any wire format, but require a message as input.

https://github.com/plprobelab/go-kademlia/blob/070b692fc703d37093ef66b60b2d2d478f5c7797/query/simplequery/query.go#L60-L62

If providing the request message to every query isn't an option (i.e because all queries use the same format), the best ways seems to go with a QueryBuilder that can build a query only using minimal information, depending on the details of the protocol in use.


Another argument against having a FindNode method defined on Endpoint is that FindNode is protocol specific, and may need different parameters depending on the DHT protocol. As an example, the protobuf format that we use defines a clusterLevelRaw field (that isn't used AFAIK), but when using similar features, we may need additional parameters to FindNode than simply a peer.ID or Key256, and we cannot predict which parameters we need to include in the interface.

https://github.com/plprobelab/go-kademlia/blob/070b692fc703d37093ef66b60b2d2d478f5c7797/libp2p/message.proto#L74-L76

iand commented 10 months ago

@guillaumemichel I think we are talking past one another. The implementation of an Endpoint belongs elsewhere but the Endpoint interface belongs in go-kademlia and deals only in types like kad.Request and kad.Response. We're saying the same thing: the endpoint cares only about getting the bytes to the right node and receiving the bytes.

The libp2p implementation of Endpoint does that by using protobuf messages with (for kad/1.0.0) the correct message type set in the protobuf. Those messages conforms to the kad.Request and kad.Response interfaces. go-kademlia knows nothing about the specific details of the message implementation.

However, I am saying that the find node operation is of fundamental utility for building any dht. So, rather than polluting go-kademlia with the details of how to construct a find node request we should let the endpoint implementation do it.

The libp2p endpoint implementation of the FindNode method (assuming kad/1.0.0 libp2p protocol) would construct a protobuf with the message type of FIND_NODE, send it to the node, and return the response to go-kademlia as a kad.Response.

In simplequery, the query takes as argument the key that is being looked up, and the lookup request message. The query simply passes down the lookup request message to the endpoint, that simply marshalls it and write it on the wire.

This is exactly how the coordinator and query state machine work. They are not dependent on message implementations. See https://github.com/plprobelab/go-kademlia/blob/main/coord/coordinator.go#L443

But, for a process inside go-kademlia that is not user-initiated and wants to explore the network, e.g. during refresh, we currently have to inject a way to create a suitable kad.Request for the query the process wants to run. I want to eliminate that injection and allow the endpoint to create the message for us and send it.

guillaumemichel commented 10 months ago

We're saying the same thing: the endpoint cares only about getting the bytes to the right node and receiving the bytes.

:+1:

IMO we should separate in distinct modules (1) the message format (Protocol module?) and (2) the transport endpoint (Endpoint), otherwise the Endpoint implementation isn't just transporting bytes, it is also dependent on a protocol.

The libp2p implementation of Endpoint does that by using protobuf messages with (for kad/1.0.0) the correct message type set in the protobuf

If the message format is packed with the transport endpoint, it means that the day that someone wants to use another protocol than kad/1.0.0 along with libp2p, they have to create a new Endpoint implementation rather than just creating another message format.

So, rather than polluting go-kademlia with the details of how to construct a find node request we should let the endpoint implementation do it.

IMO this isn't the role of the Endpoint implementation, because it isn't about carrying bytes.

But, for a process inside go-kademlia that is not user-initiated and wants to explore the network, e.g. during refresh, we currently have to inject a way to create a suitable kad.Request for the query the process wants to run. I want to eliminate that injection and allow the endpoint to create the message for us and send it.

I now understand better why you feel the need of having a FIND_NODE (with protocol specific message format) packed with the Endpoint. It would allow the routing table refresh module to utilize FIND_NODE to update itself.

It think that it would make more sense to have the RoutingTableRefresh implementation distinct from the RoutingTable implementation (it also avoids circular dependencies). I suggest to add a Protocol module, to be implemented by the DHT implementation, and defining message format and protocol specifics.

The RoutingTableRefresh implementation should take a Protocol as input. RoutingTableRefresh would depend on the RoutingTable that it aims to refresh, and on a Query mechanism. It would pass down the Protocol (or simply the message format) to the Query mechanism that would in turn pass it down to the Endpoint. This way, the Endpoint could also be used by other DHT implementations (e.g bittorrent/1.0.0, chord/1.0.0, etc.).

iand commented 10 months ago

It think that it would make more sense to have the RoutingTableRefresh implementation distinct from the RoutingTable implementation (it also avoids circular dependencies). I suggest to add a Protocol module, to be implemented by the DHT implementation, and defining message format and protocol specifics.

This is already the case. We don't have refresh/probe yet but the same design is already there for including candidates in the table https://github.com/plprobelab/go-kademlia/blob/main/routing/include.go#L85 and it responds with a general event that instructs the coordinator to send a message (via the endpoint) to issue a find node request (https://github.com/plprobelab/go-kademlia/blob/main/routing/include.go#L176)

iand commented 10 months ago

To be more precise, I am arguing for a GetClosestNodes function that sends a message to a node asking it for its list of closest nodes to a specified kademlia key.The return value would be a list of NodeInfos (NodeID + Addresses)

guillaumemichel commented 10 months ago

Endpoint implementation isn't aware of wire format (otherwise it isn't generic).

1.

func TableRefresh(rt routing.Table, endpoint Endpoint) {
    // ...
    for _, bucket := rt.buckets {
        // ...
        // gets the key of the peer to refresh
        nodeIds, addresses := endpoint.GetClosestNodes(node, key)
        // ...
        // updates routing table
    }
}

func (endpoint *EndpointImpl) GetClosestNodes(node NodeID, key kad.Key) ([]NodeID, []Address) {
    req := endpoint.FindNodeRequest(key) // wire format dependent on the endpoint implementation
    resp := endpoint.SendReqGetResp(node, req) // response would come back in the only wire format supported by the endpoint
    return resp.GetNodeIDs, resp.GetAddresses
}

I don't like endpoint.GetClosestNodes(key), because Endpoint implementation should not infer the wire format. It should be able to send any message.

IMO the following would be more adapted as it allows Endpoint implementations to remain generic.

2.

func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
    // ...
    for _, bucket := rt.buckets {
        // ...
        // gets the key of the peer to refresh
        nodeIds, addresses := endpoint.GetCloestNodes(node, key, wireFormat)
        // ...
        // updates routing table
    }
}

func (endpoint *EndpointImpl) GetClosestNodes(node NodeID, key kad.Key, wireFormat FormatType) ([]NodeID, []Address) {
    req := wireFormat.FindNodeReq(key) // wire format not dependent on the endpoint implementation
    response := endpoint.SendReqGetResp(node, req)
    resp := response.(wireFormat.FindNodeResp)
    return resp.GetNodeIDs, resp.GetAddresses
}

Then, we can even move the message generation to the caller to keep the endpoint minimal.

3.

func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
    // ...
    for _, bucket := rt.buckets {
        // ...
        // gets the key of the peer to refresh
        req := wireFormat.FindNodeReq(key)
        response := endpoint.SendReqGetResp(node, req)
        resp := response.(wireFormat.FindNodeResp)
        nodeIds := resp.GetNodeIDs
        addresses := resp.GetAddresses
        // ...
        // updates routing table
    }
}

A method on the FormatType implementation may be more relevant in order to simplify the process.

4.

func (wireFormat FormatType) FindNodeReq(endpoint Endpoint, node NodeID, key kad.Key) FindNodeResp {
    req := wireFormat.FindNodeReq(key)
    response := endpoint.SendReqGetResp(node, req)
    return response.(wireFormat.FindNodeResp)
}

func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
    // ...
    for _, bucket := rt.buckets {
        // ...
        // gets the key of the peer to refresh
        nodeIds, addresses := wireFormat.FindNodeReq(endpoint, node, key)
        // ...
        // updates routing table
    }
}

It feels better to me to have wire format handling parsing in the FormatType (to be renamed WireFormat, ProtocolFormat, ?), rather than in the Endpoint.

I find that 4. is the most elegant way to proceed, but I guess that 2. would also be fine.

Let me know what you think @iand

iand commented 10 months ago

Endpoint implementation isn't aware of wire format (otherwise it isn't generic).

I think I must be misunderstanding something. The IPFS Endpoint implementation in go-kademlia/ipfs absolutely knows about the wire format. It creates protobuf messages that are sent on streams.

guillaumemichel commented 10 months ago

Alright!

I think that a reorganization created confusion about it. Currently there is a massive libp2p package containing the libp2p Endpoint implementation (libp2pendpoint.go), as well as the specific go-libp2p-kad-dht message format (message.proto and helpers.go, which is only an helper for the go-libp2p-dht-kad message format).

However, I originally designed them to be distinct modules (see how the repo was before it got reorganized). The libp2pendpoint module is an implementation of the Endpoint interface. And the ipfsv1 module is the message implementation for the IPFS DHT format. You can look at the README.md in each folder for a description of what are the interfaces and implementations.

I agree that mixing multiple module implementations that aren't directly related to each other in the same package doesn't help to make it clear. We should probably (1) create subfolders within the libp2p package for each of the modules implementations, or (2) put back the modules where they were, until we migrate go-libp2p-kad-dht components to go-libp2p-kad-dht.

You can read more here about the initial idea behind the Endpoint interface.

It creates protobuf messages that are sent on streams.

Not exactly. It takes as input a kad.Message, then verifies that it actually implements ProtoKadMessage. The libp2p w.WriteMsg(msg) function (see below) takes a proto.Message as argument, marshalls it, and sends it on the stream.

https://github.com/plprobelab/go-kademlia/blob/44238be8a1f611e0415e7b4c03cc04f5e833cb07/libp2p/proto_msg.go#L9-L11

https://github.com/plprobelab/go-kademlia/blob/44238be8a1f611e0415e7b4c03cc04f5e833cb07/libp2p/libp2pendpoint.go#L154-L175

https://github.com/plprobelab/go-kademlia/blob/44238be8a1f611e0415e7b4c03cc04f5e833cb07/libp2p/libp2pendpoint.go#L221

https://github.com/plprobelab/go-kademlia/blob/44238be8a1f611e0415e7b4c03cc04f5e833cb07/libp2p/util.go#L9-L12

So the Endpoint implementations are just given a message, that is marshalled and sent on the wire. It is the responsibility of the caller to provide the right message. The [message] interface and module implementations are just helpers so that the callers don't need to forge the message themselves, but can just use the message format (e.g ipfsv1) that is associated with the DHT protocol/DHT implementation. But now the module is split between folders and is hard to track.

I am happy to discuss it over a call if it can help.