Open dennis-tra opened 11 months ago
I quite like the Endpoint abstraction as the component that knows how to send messages to nodes. I think it could be refined and brought down to a minimal SendMessage method plus a FindNode method, which I think is such a fundamental operation that it warrants promotion to a method.
The current Message
abstraction works quite naturally with queries. To put a record to N nodes you
Message
(currently you also need a protocol id but it can be removed)Endpoint
. The Endpoint
interprets the message (via a type assertion currently but Endpoint could be parameterised by message type) and sends it to the node, thereby putting the record.The query only needs an opaque Message
and an Endpoint
. It doesn't know what operation it is performing. The operation might not be record-oriented, it could be a ping or hello style operation.
I think to switch to using Protocol
the query would need to know the Protocol
, which method is being performed Get
or Put
and the Record
if it's a Put
method. That complicates the query interface quite a lot.
Possibly that could be solved by passing the query a function that encapsulates the call to Get
or Put
with the associated record, but then that looks a lot like simply passing a Message
implementation(which itself could simply be a function).
Another thought is that, as shown, the Protocol
methods are synchronous whereas the Endpoint
is asynchronous (via the callback). This means the caller needs to block, probably in a goroutine. This complicates the state machine and scheduler logic which is designed to dispatch a request and deal with the response as a later event.
I am not in favor of replacing the Endpoint
abstraction with a Protocol
abstraction. The only purpose of endpoint is to send and receive messages or even just bytes. It is used to connect peers to each other, and shouldn't know anything about protocols or even DHT specifics. Using specific Protocol
s as described in this PR would lead to code duplication, because if we take IPFS as an example, the FindNode
, GetProviders
, GetValue
, PutProvider
and PutValue
would need to create a new libp2p stream and write into it. The only difference in sending a FindNode
or GetProviders
message to a remote peer is the body of the message.
I think it could be refined and brought down to a minimal SendMessage method plus a FindNode method, which I think is such a fundamental operation that it warrants promotion to a method.
I agree that the Endpoint
interface should be refined and brought down to the minimal interface. However, I don't think that a FindNode
method is required. FindNode
should simply be a user of SendMessage
. FindNode
message format depends on the wire format, which is protocol specific, and the Endpoint
should be generic and not protocol specific.
In addition to SendMessage
, we may want to define a way to receive messages. We don't need to define a way to handle these messages, because this part is protocol specific, but being able to send a receive bytes (or a message) seems to be the minimum. The ReceiveMessage
method should only be required by nodes running in server mode. So we could have 2 interfaces (e.g Endpoint
and ServerEndpoint
implementing Endpoint
and the receive method).
The benefit of a minimal Message
interface (in opposition to just bytes) is that it can contain a CloserPeers
field that can be read by the endpoint, so that it can keep track of any potential closer peers. Alternatively, if we don't want to have a Message
interface and just want to provide bytes to send over the network, Endpoint
needs to have an interface to add (and remove) peers, e.g AddPeers
, that the users of SendMessage
or ReceiveMessage
could call.
This last method (AddPeers
) could also be Endpoint
implementation specific, as long as it is fine for a query to rely on a single Endpoint
implementation. Or there could be multiple classes of Endpoint
, implementing different features, but this would add complexity (I think that it is currently the case lol).
FindNode should simply be a user of SendMessage. FindNode message format depends on the wire format, which is protocol specific, and the Endpoint should be generic and not protocol specific.
Not at all. The endpoint simply translates FindNode into an appropriate message and calls SendMessage. The advantage is the caller doesn't need to know any specifics of the protocol/wire format for finding closer nodes. Currently if we want to perform a connectivity check on a peer as part of a background process we have to inject a way to construct the appropriate find node message. Moving that responsibility to the endpoint instead seems appropriate for such a fundamental operation.
IMO the message format should be a system parameter, and not part of the Endpoint
nor the Query
modules. It should be defined by the DHT implementation (e.g go-libp2p-kad-dht).
Otherwise, it means that a different Endpoint
implementation is required for each DHT protocol, and generic transport endpoints (e.g TCP, libp2p, etc.) cannot be reused from one protocol to another. go-kademlia modules are expected to be generic, and in this case Endpoint
would be protocol dependent.
Endpoint
should only care about getting a message/bytes from one node to another (and possibly learning how to reach new peers). If we need any additional capability, we should use another interface (DhtEndpoint
?) depending on an Endpoint
and on a wire format.
In simplequery
, the query takes as argument the key that is being looked up, and the lookup request message. The query simply passes down the lookup request message to the endpoint, that simply marshalls it and write it on the wire. This approach is generic, because the Query
and the Endpoint
implementation don't depend on any wire format, but require a message as input.
If providing the request message to every query isn't an option (i.e because all queries use the same format), the best ways seems to go with a QueryBuilder
that can build a query only using minimal information, depending on the details of the protocol in use.
Another argument against having a FindNode
method defined on Endpoint
is that FindNode
is protocol specific, and may need different parameters depending on the DHT protocol. As an example, the protobuf format that we use defines a clusterLevelRaw
field (that isn't used AFAIK), but when using similar features, we may need additional parameters to FindNode
than simply a peer.ID
or Key256
, and we cannot predict which parameters we need to include in the interface.
@guillaumemichel I think we are talking past one another. The implementation of an Endpoint belongs elsewhere but the Endpoint interface belongs in go-kademlia and deals only in types like kad.Request
and kad.Response
. We're saying the same thing: the endpoint cares only about getting the bytes to the right node and receiving the bytes.
The libp2p implementation of Endpoint does that by using protobuf messages with (for kad/1.0.0) the correct message type set in the protobuf. Those messages conforms to the kad.Request and kad.Response interfaces. go-kademlia knows nothing about the specific details of the message implementation.
However, I am saying that the find node operation is of fundamental utility for building any dht. So, rather than polluting go-kademlia with the details of how to construct a find node request we should let the endpoint implementation do it.
The libp2p endpoint implementation of the FindNode method (assuming kad/1.0.0 libp2p protocol) would construct a protobuf with the message type of FIND_NODE, send it to the node, and return the response to go-kademlia as a kad.Response.
In simplequery, the query takes as argument the key that is being looked up, and the lookup request message. The query simply passes down the lookup request message to the endpoint, that simply marshalls it and write it on the wire.
This is exactly how the coordinator and query state machine work. They are not dependent on message implementations. See https://github.com/plprobelab/go-kademlia/blob/main/coord/coordinator.go#L443
But, for a process inside go-kademlia that is not user-initiated and wants to explore the network, e.g. during refresh, we currently have to inject a way to create a suitable kad.Request for the query the process wants to run. I want to eliminate that injection and allow the endpoint to create the message for us and send it.
We're saying the same thing: the endpoint cares only about getting the bytes to the right node and receiving the bytes.
:+1:
IMO we should separate in distinct modules (1) the message format (Protocol
module?) and (2) the transport endpoint (Endpoint
), otherwise the Endpoint
implementation isn't just transporting bytes, it is also dependent on a protocol.
The libp2p implementation of Endpoint does that by using protobuf messages with (for kad/1.0.0) the correct message type set in the protobuf
If the message format is packed with the transport endpoint, it means that the day that someone wants to use another protocol than kad/1.0.0 along with libp2p, they have to create a new Endpoint
implementation rather than just creating another message format.
So, rather than polluting go-kademlia with the details of how to construct a find node request we should let the endpoint implementation do it.
IMO this isn't the role of the Endpoint
implementation, because it isn't about carrying bytes.
But, for a process inside go-kademlia that is not user-initiated and wants to explore the network, e.g. during refresh, we currently have to inject a way to create a suitable kad.Request for the query the process wants to run. I want to eliminate that injection and allow the endpoint to create the message for us and send it.
I now understand better why you feel the need of having a FIND_NODE
(with protocol specific message format) packed with the Endpoint
. It would allow the routing table refresh module to utilize FIND_NODE
to update itself.
It think that it would make more sense to have the RoutingTableRefresh
implementation distinct from the RoutingTable
implementation (it also avoids circular dependencies). I suggest to add a Protocol
module, to be implemented by the DHT implementation, and defining message format and protocol specifics.
The RoutingTableRefresh
implementation should take a Protocol
as input. RoutingTableRefresh
would depend on the RoutingTable
that it aims to refresh, and on a Query
mechanism. It would pass down the Protocol
(or simply the message format) to the Query
mechanism that would in turn pass it down to the Endpoint
. This way, the Endpoint
could also be used by other DHT implementations (e.g bittorrent/1.0.0, chord/1.0.0, etc.).
It think that it would make more sense to have the RoutingTableRefresh implementation distinct from the RoutingTable implementation (it also avoids circular dependencies). I suggest to add a Protocol module, to be implemented by the DHT implementation, and defining message format and protocol specifics.
This is already the case. We don't have refresh/probe yet but the same design is already there for including candidates in the table https://github.com/plprobelab/go-kademlia/blob/main/routing/include.go#L85 and it responds with a general event that instructs the coordinator to send a message (via the endpoint) to issue a find node request (https://github.com/plprobelab/go-kademlia/blob/main/routing/include.go#L176)
To be more precise, I am arguing for a GetClosestNodes function that sends a message to a node asking it for its list of closest nodes to a specified kademlia key.The return value would be a list of NodeInfos (NodeID + Addresses)
Endpoint
implementation isn't aware of wire format (otherwise it isn't generic).
1.
func TableRefresh(rt routing.Table, endpoint Endpoint) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
nodeIds, addresses := endpoint.GetClosestNodes(node, key)
// ...
// updates routing table
}
}
func (endpoint *EndpointImpl) GetClosestNodes(node NodeID, key kad.Key) ([]NodeID, []Address) {
req := endpoint.FindNodeRequest(key) // wire format dependent on the endpoint implementation
resp := endpoint.SendReqGetResp(node, req) // response would come back in the only wire format supported by the endpoint
return resp.GetNodeIDs, resp.GetAddresses
}
I don't like endpoint.GetClosestNodes(key)
, because Endpoint
implementation should not infer the wire format. It should be able to send any message.
IMO the following would be more adapted as it allows Endpoint
implementations to remain generic.
2.
func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
nodeIds, addresses := endpoint.GetCloestNodes(node, key, wireFormat)
// ...
// updates routing table
}
}
func (endpoint *EndpointImpl) GetClosestNodes(node NodeID, key kad.Key, wireFormat FormatType) ([]NodeID, []Address) {
req := wireFormat.FindNodeReq(key) // wire format not dependent on the endpoint implementation
response := endpoint.SendReqGetResp(node, req)
resp := response.(wireFormat.FindNodeResp)
return resp.GetNodeIDs, resp.GetAddresses
}
Then, we can even move the message generation to the caller to keep the endpoint minimal.
3.
func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
req := wireFormat.FindNodeReq(key)
response := endpoint.SendReqGetResp(node, req)
resp := response.(wireFormat.FindNodeResp)
nodeIds := resp.GetNodeIDs
addresses := resp.GetAddresses
// ...
// updates routing table
}
}
A method on the FormatType
implementation may be more relevant in order to simplify the process.
4.
func (wireFormat FormatType) FindNodeReq(endpoint Endpoint, node NodeID, key kad.Key) FindNodeResp {
req := wireFormat.FindNodeReq(key)
response := endpoint.SendReqGetResp(node, req)
return response.(wireFormat.FindNodeResp)
}
func TableRefresh(rt routing.Table, endpoint Endpoint, wireFormat FormatType) {
// ...
for _, bucket := rt.buckets {
// ...
// gets the key of the peer to refresh
nodeIds, addresses := wireFormat.FindNodeReq(endpoint, node, key)
// ...
// updates routing table
}
}
It feels better to me to have wire format handling parsing in the FormatType
(to be renamed WireFormat
, ProtocolFormat
, ?), rather than in the Endpoint
.
I find that 4. is the most elegant way to proceed, but I guess that 2. would also be fine.
Let me know what you think @iand
Endpoint implementation isn't aware of wire format (otherwise it isn't generic).
I think I must be misunderstanding something. The IPFS Endpoint implementation in go-kademlia/ipfs absolutely knows about the wire format. It creates protobuf messages that are sent on streams.
Alright!
I think that a reorganization created confusion about it. Currently there is a massive libp2p
package containing the libp2p Endpoint
implementation (libp2pendpoint.go), as well as the specific go-libp2p-kad-dht
message format (message.proto and helpers.go, which is only an helper for the go-libp2p-dht-kad message format).
However, I originally designed them to be distinct modules (see how the repo was before it got reorganized). The libp2pendpoint module is an implementation of the Endpoint
interface. And the ipfsv1 module is the message implementation for the IPFS DHT format. You can look at the README.md in each folder for a description of what are the interfaces and implementations.
I agree that mixing multiple module implementations that aren't directly related to each other in the same package doesn't help to make it clear. We should probably (1) create subfolders within the libp2p package for each of the modules implementations, or (2) put back the modules where they were, until we migrate go-libp2p-kad-dht components to go-libp2p-kad-dht.
You can read more here about the initial idea behind the Endpoint
interface.
It creates protobuf messages that are sent on streams.
Not exactly. It takes as input a kad.Message
, then verifies that it actually implements ProtoKadMessage
. The libp2p w.WriteMsg(msg)
function (see below) takes a proto.Message as argument, marshalls it, and sends it on the stream.
So the Endpoint
implementations are just given a message, that is marshalled and sent on the wire. It is the responsibility of the caller to provide the right message. The [message
] interface and module implementations are just helpers so that the callers don't need to forge the message themselves, but can just use the message format (e.g ipfsv1
) that is associated with the DHT protocol/DHT implementation. But now the module is split between folders and is hard to track.
I am happy to discuss it over a call if it can help.
Right now, we're working with a Endpoint interfaces to abstract away the concrete communication between two nodes in the network. Here are the interfaces
As an alternative approach, I'm proposing a
Protocol
interface with the following signature:The assumption here is that in a Kademlia network any record type (provider, ipns, peer) has the two primitives
Get
andPut
. AGet
request to a remote node will always return a list of results (Records() []R
) plus a list of nodes that know more (CloserNodes() N
).In the libp2p/IPFS case a concrete implementation of the Protocol interface (only GET) could look like this:
When I start a query in go-kademlia I need to differentiate between a GET or PUT query and then pass it the specific protocol I want to use. E.g., there should be another
Protocol
interface implementation for provider records. If I want to get the provider record for a certain key, I'd start a query by enqueuing aneventAddQuery
event and pass it the protocol it should use. When go-kademlia starts the query it callsGet
on the protocol and uses the resultCloserNodes
result to guide further queries. The genericRecord
type allows type safety. A concreteProtocol
implementation would return the correct record type to the caller.A Put operation would use some kind of base-protocol that implements node discovery Get/Put operations to find the closest node to a certain key. Only the last step of storing the record would use the concrete record-specific protocol.
There are a few issues with this approach:
Record
interfaces and then when passing the record out to the user back toR
.Get
takes a Kademlia key at the moment. In IPFS land we're querying for the preimage of the Kademlia key, so there must be a way to combine both and pass it into theGet
method.I formulated the above in a hurry and have some proposals for the issues that I'll post here next week.