Open september28 opened 5 years ago
This array is just a helper for quickly comparing hexadecimal hashes. Index is the value of a xor operation on 4 consecutive bits in a pair of hashes. Value is the number of bits that differ.
For example take a look at the following hashes:
1a2b8c3d4
1a2bfc3d4
The numbers at index 4 are different: 0x8 and 0xf respectively. Their binary values are:
0x8 == 0b1000
0xf == 0b1111
The result of the xor operation will be 0x8 ^ 0xf == 0b0111
which is equal to 7. one_bits[7]
is equal to 3, which is the number of different bits in 0x8
and 0xf
. Hope this helps!
I was wondering if anyone could shed light on the one_bits array since I do not have access to the original paper upon which this library is based.
I can see that the hammingDistance function essentially calculates the bit difference between the relevant char in the two hashes to be compared. this XOR operation n1^n2 will, in theory, be in the range 0 to 15, since parseInt("0",16)=0 and parseInt("f",16)=15. So my question is that the one_bits array seems to essentially "decelerate" the hamming distance by increasing the distance by either 0,1,2,3, or 4 instead of the bit difference (0-15). Why does the algorithm do this? var one_bits = [0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4]; Interestingly, the one_bits array is non-linear, so a bit difference of 3 will increase distance by 2 whereas a bit difference of 4 will increase distance by 1...??
Thanks!