Open derekchiang opened 7 years ago
I would also like to add that we can keep the `Send()` and `Receive()` API around so that flooding is still supported. In that case, each node would need to maintain two message queues, one for `Send`/`Receive` and the other for `Broadcast`/`SetMessageHandler`. A single queue would not work, because `Receive` wouldn't know how to handle the broadcast header, and vice versa.
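To make the two-queue idea concrete, here is a minimal sketch in Go. Everything in it is an assumption for illustration: the `Envelope` type, the `Kind` tag, and the `Dispatch` method are hypothetical names, not part of the library. It only shows how an incoming message could be routed to a separate queue per API so that neither consumer has to understand the other's framing.

```go
package main

import "fmt"

// Kind tags which API a wire message belongs to (hypothetical field).
type Kind int

const (
	KindFlood     Kind = iota // sent via Send(), read via Receive()
	KindBroadcast             // sent via Broadcast(), delivered to the handler
)

// Envelope is a hypothetical wire format; the real library may differ.
type Envelope struct {
	Kind    Kind
	Payload []byte
}

// Node keeps one queue per API so neither consumer sees the other's framing.
type Node struct {
	floodQ     chan Envelope
	broadcastQ chan Envelope
}

// NewNode creates a node whose two queues each buffer up to size messages.
func NewNode(size int) *Node {
	return &Node{
		floodQ:     make(chan Envelope, size),
		broadcastQ: make(chan Envelope, size),
	}
}

// Dispatch routes an incoming envelope to the queue for its API.
// Envelopes with an unknown kind are dropped.
func (n *Node) Dispatch(e Envelope) {
	switch e.Kind {
	case KindFlood:
		n.floodQ <- e
	case KindBroadcast:
		n.broadcastQ <- e
	}
}

func main() {
	n := NewNode(4)
	n.Dispatch(Envelope{Kind: KindFlood, Payload: []byte("chirp")})
	n.Dispatch(Envelope{Kind: KindBroadcast, Payload: []byte("block")})
	fmt.Println(len(n.floodQ), len(n.broadcastQ)) // prints "1 1"
}
```

Whether the tag lives in the envelope or the two APIs use separate streams altogether is an open design choice; the sketch only illustrates the separation.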
Currently, messages sent to the `Send()` channel are sent to every peer that the node is connected to. This API allows for a very simple "flooding" scheme: to send a message to every node in the network, nodes simply forward whatever messages they receive. This pattern is implemented in the Chirp example.

Flooding, however, is inefficient in terms of the number of messages generated, since each node can receive the same message multiple times from different neighbors. Since the nodes are connected in a Kademlia topology, we can in fact implement "structured broadcast", wherein nodes only forward messages to selected neighbors. When implemented correctly, structured broadcast can offer similar guarantees to flooding in terms of node coverage, while reducing the total number of messages generated and potentially improving latency. I've linked a couple of relevant papers at the end of this issue.
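To put a rough number on the overhead, the sketch below counts messages for flooding in a toy network. It assumes a fully connected network of `n` nodes (not a real Kademlia topology) in which each node forwards a message the first time it sees it, to every neighbor except the sender. The function name `floodMessages` is made up for this example.

```go
package main

import "fmt"

// floodMessages counts the messages sent when node 0 floods a fully connected
// network of n nodes. A node forwards a message only the first time it sees
// it, to every neighbor except the one it came from.
func floodMessages(n int) int {
	type msg struct{ from, to int }
	seen := make([]bool, n)
	seen[0] = true
	var queue []msg
	for to := 1; to < n; to++ {
		queue = append(queue, msg{0, to})
	}
	sent := 0
	for len(queue) > 0 {
		m := queue[0]
		queue = queue[1:]
		sent++
		if seen[m.to] {
			continue // duplicate: received, but not forwarded again
		}
		seen[m.to] = true
		for next := 0; next < n; next++ {
			if next != m.to && next != m.from {
				queue = append(queue, msg{m.to, next})
			}
		}
	}
	return sent
}

func main() {
	n := 8
	// Any broadcast needs at least n-1 messages to reach n-1 other nodes.
	fmt.Printf("flooding: %d messages, lower bound: %d\n", floodMessages(n), n-1)
	// prints "flooding: 49 messages, lower bound: 7"
}
```

In this toy setting flooding costs (n-1)^2 messages, while a broadcast in which every node receives the message exactly once needs only n-1. Real networks are sparser, so the gap is smaller, but the duplicate deliveries are the same effect.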
For structured broadcast to work, however, this library needs to expose an API that's more powerful than `Send()` and `Receive()`. @enzoh made a good point that the decision on whether a message should be forwarded has to be made at the application level, which complicates the matter. My current thinking is along these lines: we remove the `Send()` and `Receive()` API. Instead we add `Broadcast()` and `SetMessageHandler`.

Basically, the `SetMessageHandler` API takes a callback that's invoked whenever a message is received. If the message is "valid" (according to the application), then the callback returns `true`, in which case the message is forwarded. Otherwise, the callback returns `false` and the message is not forwarded.

`Broadcast()` is used to initiate a structured broadcast. That is, when `Broadcast()` is called, the caller understands that the message will eventually be delivered to the entire network (as opposed to just the neighbors). To achieve that, we wrap every message in a header that contains metadata related to the structured broadcast (e.g. the range of the neighbors that the next hop should forward to).

When a node receives a message (which is always wrapped in a broadcast header), it unwraps the message and reads the broadcast metadata. It then updates the metadata to generate a new header, which further restricts the range of neighbors that should be forwarded to in the next hop. It then invokes the callback with the vanilla message. If the callback returns `true`, it forwards the message with the new header to a selected set of neighbors (as specified in the previous header).
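The flow described above can be sketched in Go as follows. This is a toy model under strong assumptions, not the library's implementation: the ID space is fully populated (2^`bits` nodes), each bucket `i` holds exactly one contact (the node whose ID differs in bit `i`), there is no churn, and messages are delivered by direct function calls. The `Header` layout, the `MinBucket` field, and the signatures of `Broadcast` and `SetMessageHandler` are hypothetical.

```go
package main

import "fmt"

const bits = 3 // ID length; the toy network has 2^bits nodes

// Header is hypothetical broadcast metadata: the receiver may only forward
// to buckets with index >= MinBucket.
type Header struct{ MinBucket int }

// Handler mirrors the proposed callback: return true to forward the message.
type Handler func(msg []byte) bool

type Node struct {
	id      int
	net     map[int]*Node // stand-in for the real transport
	handler Handler
	got     int // how many times this node received a message
}

// SetMessageHandler registers the application-level validity check.
func (n *Node) SetMessageHandler(h Handler) { n.handler = h }

// Broadcast is called only at the originator and starts with the full range.
// It returns the total number of messages sent across the network.
func (n *Node) Broadcast(msg []byte) int { return n.forward(Header{0}, msg) }

// forward sends msg to one contact per allowed bucket. Each contact gets a
// new header that narrows the range it may forward to.
func (n *Node) forward(h Header, msg []byte) int {
	sent := 0
	for i := h.MinBucket; i < bits; i++ {
		peer := n.net[n.id^(1<<i)]
		sent += 1 + peer.receive(Header{MinBucket: i + 1}, msg)
	}
	return sent
}

// receive unwraps the message, asks the application whether it is valid, and
// forwards it only if the handler returns true.
func (n *Node) receive(h Header, msg []byte) int {
	n.got++
	if n.handler == nil || !n.handler(msg) {
		return 0
	}
	return n.forward(h, msg)
}

// newNetwork builds the toy network with a handler that accepts everything.
func newNetwork() map[int]*Node {
	net := make(map[int]*Node)
	for id := 0; id < 1<<bits; id++ {
		net[id] = &Node{id: id, net: net}
		net[id].SetMessageHandler(func([]byte) bool { return true })
	}
	return net
}

func main() {
	net := newNetwork()
	sent := net[5].Broadcast([]byte("hello"))
	fmt.Println("messages sent:", sent) // prints "messages sent: 7"
}
```

In this model every node other than the originator receives the message exactly once, so 8 nodes need 7 messages. One consequence worth noting: if a node's callback returns `false`, the nodes that it alone was responsible for forwarding to never receive the message, so coverage depends on honest nodes agreeing on validity.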
To reiterate:

- `Broadcast()` is only ever invoked at the originator of the broadcast.
- `Broadcast()` wraps the message in a "broadcast header".
- When a node receives a message, it invokes the callback with the unwrapped message. If the callback returns `true`, the node updates the header and forwards the message to a selected set of neighbors.

Relevant papers:

- Efficient Broadcast in Structured P2P Networks
- Debunking some myths about structured and unstructured overlays
- Evaluation of alternatives for the broadcast operation in Kademlia under churn