3rd-Eden / node-hashring

hashring is a consistent hashing algorithm for Node.js that is compatible with libketama and python's hash_ring package
MIT License
350 stars 61 forks source link

Replica limit #44

Open tcf909 opened 7 years ago

tcf909 commented 7 years ago

I apologize if this is an ignorant question ahead of time.

Out of curiosity, during the replica creation loop: "key = hashValueHash(....)", why do you multiply by 4, this limits the number of replicas to 4 using MD5 and 16 using SHA512. From what I understand of your code, it is more performant to use replicas given the bitwise nature of the shift than new vnodes, which require rehashing a value.

Using a value of 1 would extend the replica count to 16 an 64 respectively I believe?

Again, pardon the ignorance. My testing shows early indications of this being true, I just do not know the ramifications further down this path...

Enjoy the vacation.

letiantian commented 7 years ago

And if the replicas is 5, j will reach 4 , then

key = hashValueHash(x[3 + j * 4], x[2 + j * 4], x[1 + j * 4], x[j * 4]);

key always is 0.

tcf909 commented 7 years ago

That is what I'm saying,

If you use [x + j] rather than [x + j * 4], you allow j to reach 16 using an md5 digest and 64 using a sha512 digest without the key being 0;

And from what I understand in the code, it is less cpu intensive to use the bitwise shift operator to create replicas than creating a new digest.

This would increase the number of cache slots dramatically, which would cut down on key redistribution time while not increasing the continuum time.

Testing on my side looked pretty good.

tcf909 commented 7 years ago

And to clarify my comments:

we use the var x which is a result of self.digest x = self.digest(server.host +'-'+ i);

self.digest returns a buffer of varying lengths depending on the digest used. In the case of MD5 it has a length of 16.

we then create the var key by using 4 indexes of var x. Currently using [# + j * 4] this would limit our total replicas to 4 ( 16 / 4 ). If we used [# + j] the replica limit would only be limited by the # of indexes in the buffer (as you shift from using the first 4 index to using 3 undefined values and the very last index to create the maximum replica.