Utilizing Kademlia for Faster Block Propagation

RSenApps commented 5 years ago

I have an idea for improving the throughput and latency of propagation of blocks and transactions throughout the network. As I understand this has a large effect on the overall performance of the system. I research compilers rather than distributed systems at MIT (though I do have some experience building blockchain systems) so please let me know if something I say is completely off.

As I understand it, right now Cardano uses Kademlia for establishing a peer to peer network with a small diameter. However to broadcast a block, a node must send an INV message to all peers to inform them that there is a new block and then if a peer has not already heard about this block from another peer it must send a REQ message to this node. Finally, this REQ message is answered with the larger DATA message. These INV and REQ messages are relatively cheap, but there are a large number of INV messages sent (# of peers per node * # of nodes). For a 1M node network with K=7, this means that each node receives approximately 140 INV messages per block.

In the following scheme, INV messages can be removed entirely and the minimum number of DATA messages will still be sent. This also means that each propagation step skips a network round trip for the INV/REQ part of the protocol. More importantly, it also ensures that every node sends a similar number of DATA messages, which keeps the propagation tree balanced. This is not true if you send INV messages to all of your peers and allow them to select the first node they hear from.

This scheme is much clearer when considering a DHT implementation with a circular hash space such as Chord (which Kademlia is based off of), but I believe this extends to Kademlia's XOR distance metric as well since it satisfies the triangle inequality.

Basically the idea is to propagate the block according to the tree rooted at the origin of the block or transaction. In effect this is using the ability for Kademlia to achieve a log(n) table lookup to instead broadcast information with log(n) latency. The root node sends messages to its F (fan-out) peers of farthest distance that are in different k-buckets. Each of these messages include the block or transaction as well as a distance (relative to the root node). The peer that receives this message is responsible for distributing the message to all nodes that are further than itself from the root node, but closer than the distance passed. This peer will tend to have more nodes closer to itself (due to the k-bucketing) and therefore can fan out this message among these nodes.

As each node is given a distance range to fan out to that is separate from any other node's range, it ensures that no duplicate DATA messages are sent. The closest peer to the root node must also be instructed to propagate to all nodes between itself and the root node. As each node is guaranteed to know at least one node in each of its subtrees, each node is guaranteed to eventually receive the message (just like how in Kademlia every table lookup is guaranteed to eventually succeed).

For an example, I will consider a consistent hashing space like Chord since it is more easily understood. Consider a hash space from 0-99 with nodes numbered as such. I am assuming that the space is full of 100 nodes (though this extends to when it is not full). If node 0 creates a new block, it sends the following messages to the following nodes: To node 50: Send this message to nodes 51-99 To node 25: Send this message to nodes 26-49 To node 12: Send this message to nodes 13-24 To node 6: Send this message to nodes 7-11 To node 3: Send this message to nodes 4-5 To node 1: Send this message to node 2

Node 50 then repeats this procedure, sending to nodes 75, 62, 56, 53, 51. Node 25 sends to nodes 37, 31, 28, 26. And this continues until all nodes have received the message.

There are some additional complexities here when you consider malicious peers that are incentivized to not propagate your message. However, I believe that if good peers were to randomly check if peers downstream of a potential malicious peer eventually heard their message, you could then track backwards to determine which node failed to propagate the network and then ban this node from the network. Constructing a scheme like this might actually ensure stronger guarantees than the current approach since in the current network a malicious peer that only transmits its own mined blocks can not be discovered. Of course before implementation, it would be necessary to formally prove what these guarantees are, but I wanted to post this proposal here to raise some initial discussion and get feedback on it.

Some questions:

Is something like this already being done in the blockchain space?
Does this idea generally sound feasible / is there something that I am missing?
If this idea was to be implemented into Cardano what steps would need to be taken?
Will the improvements I mention above result in a noticeable change in performance or is there another bottleneck in Cardano?

RSenApps commented 5 years ago

Is there a better place to post this suggestion? Would appreciate any ideas from those who are more familiar with how the cardano development effort is structured.

karknu commented 5 years ago

At the moment Cardano is running in a federated setting and is not utilizing Kademlia . New network related work for Cardano's decentralized release is going on over at https://github.com/input-output-hk/ouroboros-network . Work on peer-discovery hasn't been pushed but there are no plans to use Kademlia for it.

RSenApps commented 5 years ago

Ok, thank you for the information so I assume that this document https://cardanodocs.com/technical/protocols/p2p/ is out of date then? Do you have a link where I could read more on the proposed peer discovery method? As long as peer discovery is still done in a structured method then I think an adapted form of this scheme should still work.

karknu commented 5 years ago

Yes, the document at https://cardanodocs.com/technical/protocols/p2p/ is out dated. The peer discovery method hasn't been made public yet.

RSenApps commented 5 years ago

Ok, thanks for letting me know. Once the protocol is made public do you mind linking it here (also do you know approximately when I should expect it to be made public)? I do feel like there is a pretty significant performance benefit to structuring the network as I suggested above, but would like to compare to whatever peer discovery mechanism ends up getting used.

karknu commented 5 years ago

I will make a post here when it is pushed.

input-output-hk / cardano-sl

Utilizing Kademlia for Faster Block Propagation #4012