libp2p / go-libp2p-kad-dht

A Kademlia DHT implementation on go-libp2p
https://github.com/libp2p/specs/tree/master/kad-dht
MIT License

new version of the stream handler that uses a compact representation per provide #871

Open Jorropo opened 11 months ago

Jorropo commented 11 months ago

The current protobuf definition is backward: it allows multiple peer IDs per key and multiple addresses per peer ID. We also enforce that in the server handler.

This means that for one announcement we may send a few times more data in multiaddrs alone. Repeated over millions of CIDs, this is a very wasteful use of resources.
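To put a rough number on that overhead, here is a minimal sketch that compares the payload of one provide carrying key, peer ID and addresses against the key alone. The key and peer ID sizes correspond to a sha2-256 multihash and an Ed25519 peer ID; the address sizes and the `legacyPayload` helper are hypothetical, and protobuf framing is ignored.

```go
package main

import "fmt"

// Sizes in bytes. The address sizes used in main are hypothetical and
// only serve as an example.
const (
	keySize    = 34 // sha2-256 multihash: 2-byte prefix + 32-byte digest
	peerIDSize = 38 // Ed25519 peer ID: 2-byte prefix + 36-byte encoded key
)

// legacyPayload returns the approximate payload of one provide that
// carries the key, the provider's peer ID and all of its addresses.
// Protobuf framing overhead is ignored.
func legacyPayload(addrSizes []int) int {
	total := keySize + peerIDSize
	for _, s := range addrSizes {
		total += s
	}
	return total
}

func main() {
	addrs := []int{8, 20, 11} // hypothetical binary multiaddr sizes
	legacy := legacyPayload(addrs)
	fmt.Printf("legacy: %d bytes, key only: %d bytes, ratio: %.1fx\n",
		legacy, keySize, float64(legacy)/float64(keySize))
}
```

With more or longer addresses (for example ones carrying certificate hashes) the ratio grows further.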

We should create a new version of the stream handler that does not enforce that; instead, multiaddrs would be exchanged out of band (via libp2p identify). This will help because (CIDs you host × K) / (total DHT servers), i.e. the average number of records you send to each server, can easily be multiple orders of magnitude above 1, so the protocol should orient itself towards batching CIDs or multihashes. We could even have a dedicated stream handler for ADD_PROVIDER and drop protobuf; it would instead be a pure stream of multihashes, but this is maybe overkill.

guillaumemichel commented 11 months ago

Yes, it makes sense! We don't need to include addresses in the message, because the remote peer already knows them. As for the peer ID, we don't need to include it either, as long as a peer isn't allowed to advertise provider records for peers other than itself. Allowing delegated provides would be a protocol change anyway, regardless of the wire format, because servers currently block it.


I like the idea of having the following protobuf format for the put message:

Method: PUT_PROVIDER
Keys: CID, CID, CID, ...

Both the Reprovide Sweep and fullRT should be able to benefit from it, rather than sending PUT requests back to back.
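A minimal sketch of what batching could look like on the sender side, assuming the keys have already been grouped by destination peer. The `putProviders` type and the `maxBytes` budget are hypothetical and only mirror the "one method, many keys" shape above; they are not the message type defined by the current kad-dht protobuf.

```go
package main

import "fmt"

// putProviders is a hypothetical batched message: one method, many keys.
type putProviders struct {
	Keys [][]byte
}

// batchKeys splits keys into messages whose summed key bytes stay at or
// below maxBytes. A key larger than maxBytes gets a message of its own.
func batchKeys(keys [][]byte, maxBytes int) []putProviders {
	var batches []putProviders
	var cur putProviders
	size := 0
	for _, k := range keys {
		// Flush the current batch if adding k would exceed the budget.
		if len(cur.Keys) > 0 && size+len(k) > maxBytes {
			batches = append(batches, cur)
			cur, size = putProviders{}, 0
		}
		cur.Keys = append(cur.Keys, k)
		size += len(k)
	}
	if len(cur.Keys) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	// Five placeholder 34-byte keys (the size of a sha2-256 multihash).
	keys := make([][]byte, 5)
	for i := range keys {
		keys[i] = make([]byte, 34)
	}
	for i, b := range batchKeys(keys, 100) {
		fmt.Printf("message %d: %d keys\n", i, len(b.Keys))
	}
}
```

Compared with sending one PUT request per key back to back, the per-message overhead is paid once per batch instead of once per key.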

We could use a new protocol ID (e.g. 1.1.0 or 2.0.0) in order to start benefiting from this new format. If we decide to use a new protocol, we should also identify other wire changes (and light protocol changes) that we may want to add (such as mandatory signed peer records). This discussion should certainly continue at libp2p/specs, since it is about protocol changes.