ethereum / devp2p

Ethereum peer-to-peer networking specifications
985 stars 275 forks source link

Consider alternatives to XOR distance metric #52

Open polarker opened 5 years ago

polarker commented 5 years ago

As far as I know, Ethereum uses a variant of Kademlia protocol. The distance of two node is based on the common prefixes of node ids (hashed actually).

In this way, the whole space of node ids are divided into two branches, one branch ids starting with 0 and another branch ids starting with 1. In theory, the nodes in branch 0 have near neighbors only from branch 0 and nodes in branch 1 have near neighbors from branch 1. Therefore, there is no neighbor connections between branch 0 and branch 1. In practice, bootstrap nodes have eased this problem probably.

I found this issue when I was designing my own blockchain algorithm/protocol. My fix is pretty simple as well, replacing xor metric by hamming distance, i.e. changing from distance = id1 xor id2 to distance = sum bits of(id1 xor id2). With hamming distance, the network has the same diameter as xor metric, but not partitioned.

P.S. I post this on ethresear.ch yesterday, but maybe here is a better place for it.

ChihChengLiang commented 5 years ago

Let me post the reference to the research post here

fjl commented 5 years ago

As Jannik explained on ethresear.ch, Kademlia is not used to determine connections between nodes. The node discovery DHT is based on Kademlia, but the way nodes talk to each other in the DHT is unrelated to actual peer connections.

Your hypothesis is not correct because bucket zero in the table contains nodes from both 'branch 0' and 'branch 1'.

fjl commented 5 years ago

I appreciate the tip to try hamming distance though. We need routing to work and I'm not sure it works properly with just hamming distance, but it's still a good idea to evaluate whether another distance metric would work better than xor.

fjl commented 4 years ago

It appears that Hamming distance based DHTs have been attempted in the past, the main goal being fast similarity search: http://ce-resd.facom.ufms.br/sbrc/2012/ST4_2.pdf