3rd-Eden / node-hashring

hashring is a consistent hashing algorithm for Node.js that is compatible with libketama and Python's hash_ring package.
MIT License

Does the hash depend on the order of the servers AND the server name? #11

Closed: pselden closed this issue 10 years ago

pselden commented 11 years ago

I was doing some tests to see how servers from 1 hash ring would compare to servers on another hash ring.

https://gist.github.com/pselden4/0fd450631f067ad7ea40

I was expecting the mapping to depend on the server names, but not on the order in which the servers were added. However, these tests show that a key gets hashed to a server based on both the name and the order in which the server was added. Is that expected? I would expect the server name to affect the hash, but not the order. Am I wrong in that expectation?
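
(For context, a minimal sketch of the kind of comparison being made, assuming the module's documented API of new HashRing(servers) and ring.get(key):)

    // Sketch only: build two rings from the same servers in different orders
    // and count how many keys map to different nodes between the two rings.
    var HashRing = require('hashring');

    var ringA = new HashRing(['RedisShard-01', 'RedisShard-02', 'RedisShard-03']);
    var ringB = new HashRing(['RedisShard-03', 'RedisShard-01', 'RedisShard-02']);

    var mismatches = 0;
    for (var i = 0; i < 10000; i++) {
      var key = 'key-' + i;
      if (ringA.get(key) !== ringB.get(key)) mismatches++;
    }

    // If order did not matter, this would be 0.
    console.log('keys mapping to different nodes:', mismatches);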

pselden commented 11 years ago

I dug in a little and realized that multiple nodes are getting put on the same ring number, so the last server that maps to that ring number wins. This also helps to explain #3 -- the nodes that appear later in the list are clobbering the earlier ones, so more keys end up mapping to them.

The surprising thing to me is that there are a LOT of collisions on the ring. For instance, look at this grouping:

{ key: 959657777, node: 'RedisShard-03:50105' },
{ key: 959657777, node: 'RedisShard-03:50085' },
{ key: 959657777, node: 'RedisShard-03:50115' },
{ key: 959657778, node: 'RedisShard-01:50024' },
{ key: 959657778, node: 'RedisShard-01:50054' },
{ key: 959657778, node: 'RedisShard-03:50099' },
{ key: 959657778, node: 'RedisShard-03:50072' },
{ key: 959657778, node: 'RedisShard-03:50110' },
{ key: 959657779, node: 'RedisShard-01:50004' },

3 collisions for the first key and 5 collisions for the second key!
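
A minimal sketch of the clobbering effect (an illustration of the symptom, not the library's actual internals): if the ring is stored as a plain object keyed by ring position, the last node written to a position silently replaces the earlier ones.

    // Illustration only: a ring kept as { position: node } can hold a single
    // node per position, so later writes overwrite earlier ones.
    var ring = {};

    ring[959657777] = 'RedisShard-03:50105';
    ring[959657777] = 'RedisShard-03:50085';
    ring[959657777] = 'RedisShard-03:50115'; // the two earlier entries are gone

    console.log(ring[959657777]); // 'RedisShard-03:50115'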

3rd-Eden commented 11 years ago

I haven't been able to track down the cause of this, but I'm more than happy to accept pull requests on the matter.

3rd-Eden commented 11 years ago

FYI:

The distribution that is calculated in my test suite is actually quite good:

{
 '192.168.0.104:11212': 31887,
 '192.168.0.103:11212': 34081,
 '192.168.0.102:11212': 34032
}

So it might have to do with the way the keys are hashed. When using whirlpool as the algorithm:

{
  '192.168.0.103:11212': 33061,
  '192.168.0.102:11212': 33106,
  '192.168.0.104:11212': 33833
}
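
(For anyone who wants to reproduce this, a rough sketch of such a distribution check -- not the actual test suite, and again assuming the documented new HashRing(servers) and ring.get(key) API:)

    // Rough sketch: hash many keys through the ring and count how often each
    // server is chosen; an even distribution gives roughly iterations / 3 each.
    var HashRing = require('hashring');

    var iterations = 100000
      , counts = {}
      , ring = new HashRing([
          '192.168.0.102:11212',
          '192.168.0.103:11212',
          '192.168.0.104:11212'
        ]);

    for (var i = 0; i < iterations; i++) {
      var server = ring.get('key-' + i);
      counts[server] = (counts[server] || 0) + 1;
    }

    console.log(counts);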

pselden commented 11 years ago

The distribution gets much worse when you add more nodes.

Replace the first few lines of your distribution test with something like this:

    var iterations = 100000
      , nodes = { }
      , numNodes = 128
      , ring;

    // build 128 nodes, each with an equal weight of 1
    for (var n = 1; n <= numNodes; n++) {
      var nodeName = '192.168.0.' + n + ':11212';
      nodes[nodeName] = 1;
    }

And the test will fail.

{ '192.168.0.1:11212': 405,
  '192.168.0.2:11212': 413,
  '192.168.0.3:11212': 538,
  '192.168.0.4:11212': 444,
  '192.168.0.5:11212': 616,
  '192.168.0.6:11212': 647,
  '192.168.0.7:11212': 402,
  '192.168.0.8:11212': 374,
  // ...
  '192.168.0.126:11212': 1400,
  '192.168.0.127:11212': 1462,
  '192.168.0.128:11212': 1317 }

As I mentioned, it is because there are many collisions on the ring when the nodes are added to it, so this line: https://github.com/3rd-Eden/node-hashring/blob/master/index.js#L143 will favor the last hash that is added to that position on the ring.

I'll dig a bit to try to figure out why there are so many collisions.

Edit: also, if you log a histogram of the ring values for the nodes in this test, you end up with this:

https://gist.github.com/pselden4/150952cd17cee0825a5f

Notice that there is a LARGE grouping of collisions at the lower key numbers, then a big gap in the key ring, and then things start to even out a little more (but there are still some collisions).

pselden commented 11 years ago

There are so many collisions because of the "long" conversion that is being done, and because the crc32 implementation here only has 10 digits to choose from, due to the mapping from digit position to charCode.

I'm not sure what the purpose is of looping from k = 0 to k = 3 when you generate the ring, but inside the hashValue function it ends up looking at indexes that are outside of the array (and thus 0). When the hash array has length 10 and k = 2, it loops through positions 9, 10, 11 and 12; only hash[9] has any value, so that is the one used for the placement. The same happens whenever the hash is fewer than 10 digits long.
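
To illustrate the out-of-range reads (this is just JavaScript behaviour with a ketama-style byte packing, not necessarily the library's exact expression): reading past the end of the array gives undefined, which bitwise operators coerce to 0, so only the in-range position contributes to the ring key.

    // Positions 10, 11 and 12 do not exist, so they collapse to 0 and only
    // hash[9] influences the computed ring key.
    var hash = [5, 7, 1, 9, 2, 8, 4, 6, 3, 250]; // length 10

    console.log(hash[11]); // undefined

    var ringKey = (hash[12] << 24) | (hash[11] << 16) | (hash[10] << 8) | hash[9];
    console.log(ringKey); // 250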

I'm not exactly sure what strategy you're going for there, but perhaps if you allowed the hash function to just return a number instead of an array (i.e. remove the hashValue function) and didn't do that looping over an array, it might give a better distribution.

The second thing that could be done, to make it so order doesn't matter, is to sort according to both the key AND the node name. This involves keeping a sorted list of key/node-name pairs so you can do the comparison.
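
A sketch of what that could look like (a hypothetical structure, not the library's internals): keep the ring as an array of hash/node entries and sort on the hash first and the node name second, so the final ordering no longer depends on insertion order.

    // Hypothetical ring entries, sorted by hash with the node name as a
    // tie-breaker so colliding nodes keep a stable order.
    var ring = [
      { hash: 959657777, node: 'RedisShard-03:50105' },
      { hash: 959657777, node: 'RedisShard-03:50085' },
      { hash: 959657778, node: 'RedisShard-01:50024' }
    ];

    ring.sort(function (a, b) {
      if (a.hash !== b.hash) return a.hash - b.hash;
      return a.node < b.node ? -1 : a.node > b.node ? 1 : 0;
    });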

Because this involves so many changes to the core functionality, I'm not comfortable just doing a pull request that would probably change everything, but I may look into just creating a new repository and then in the future we could decide whether it is worth merging the two.

Edit: also, using MurmurHash3 (https://github.com/garycourt/murmurhash-js/blob/master/murmurhash3_gc.js) and increasing the vnode_count would help.

ghost commented 11 years ago

+1, we are seeing this exact issue too; we're using 3rd-Eden's memcached and seeing way more keys ending up on one server than on the others. The name also seems to affect the hashing. We are investigating too.

3rd-Eden commented 11 years ago

@pselden4 I'll be more than happy to merge in the changes and/or add you as a contributor, since this seems to be a pressing issue for both of you.

pselden commented 11 years ago

Right now I just have the constructor and get implemented, not add/remove/replace/range -- so there would still be a lot to do.

Here's what I'm doing for the constructor and get though: https://gist.github.com/pselden4/1683fac5bc0f6b2bdb2c

The gist of it is to store the hash value and an index into the list of names (so we don't have to store the name multiple times). It does a binary search for the correct position in the ring and then verifies that it chose the first one.

MurmurHash3 is just an object that implements getHash(key) and returns an integer.
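
Roughly, the lookup described above could look like this (a sketch under the assumptions in that description -- ring is sorted by hash, each entry stores an index into the list of node names, and the hasher exposes getHash(key); the names here are hypothetical, not the gist's code):

    // Sketch: binary search for the first ring entry whose hash is >= the
    // key's hash, wrapping around to the first entry when we run off the end.
    function get(ring, nodeNames, hasher, key) {
      var target = hasher.getHash(key)
        , low = 0
        , high = ring.length;

      while (low < high) {
        var mid = (low + high) >>> 1;
        if (ring[mid].hash < target) low = mid + 1;
        else high = mid;
      }

      var entry = ring[low === ring.length ? 0 : low];
      return nodeNames[entry.index];
    }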

I don't think I'll have time to implement the rest of the stuff in the near future, but this code is free to use or modify for anyone who does want to give it a shot.

The other problem is -- like you said, upgrading will be a breaking change.

3rd-Eden commented 11 years ago

Upgrading would just mean releasing a new major version. Anyway, I'm also diving back into the code to see if I can figure out where it goes wrong.

3rd-Eden commented 11 years ago

Hey,

I have spent my weekend trying to track this stuff down and to find the incorrect ring behaviour. I uncovered a couple of things. One is that my ring is overriding some values during generation. But the biggest issue is indeed caused by the hashValue function: https://github.com/3rd-Eden/node-hashring/blob/master/index.js#L367-L399 It needs to support 64-bit bit shifting, but JavaScript only supports 32-bit bit shifting. The only decent solution I have found for this is to write the bit-shifting sections of the code as a native addon in C++.
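
(The 32-bit limitation is easy to see in plain JavaScript:)

    // JavaScript bitwise operators work on signed 32-bit integers: shift
    // counts wrap at 32 and values are truncated to 32 bits before the
    // operation, so large results come back wrong or negative.
    console.log(1 << 32);             // 1  (the shift count is taken modulo 32)
    console.log(Math.pow(2, 32) | 0); // 0  (everything above 32 bits is discarded)
    console.log(0xffffffff | 0);      // -1 (reinterpreted as signed 32-bit)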

I'm working on it on and off this weekend, as it's also my girlfriend's birthday and she probably wants some attention as well. But I'll keep you guys posted on my refactor.

pselden commented 11 years ago

Do you actually need a 64-bit value in the general case -- or is that just there to improve the distribution?

If you allowed the user to pass in whatever hashing algorithm they want and just specified that it needs to return an integer, then I think it would be a lot simpler.

For instance, I'm using a 32-bit MurmurHash, and it seems to work relatively well. If I decided to use a 64-bit version of it to improve the distribution, that would be up to me.

Does that make sense?

3rd-Eden commented 11 years ago

It mostly depends on the length of the keys that you are looking up. Calculating the hash value of "we are the knights who say Ni foo bar" in Python using an md5 algorithm yields 3570309854 as the hash value. If you do the same test in Node you get -724657442, which I consider a major issue (and which I'm actively resolving at the moment).
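
(For what it's worth, -724657442 is exactly 3570309854 reinterpreted as a signed 32-bit integer, which points at the same 32-bit truncation mentioned above:)

    // An unsigned right shift by 0 reinterprets the signed 32-bit result as
    // unsigned, recovering the value Python reports.
    console.log(-724657442 >>> 0);             // 3570309854
    console.log(3570309854 - Math.pow(2, 32)); // -724657442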

pselden commented 11 years ago

I guess one question that needs to be answered is whether this is meant to be a general-purpose hash-ring solution or one that directly matches another language/framework's implementation.

Btw, I was able to implement add/remove/replace relatively easily: https://gist.github.com/pselden4/33c283e6e98aba82b556

I still need to add the range function and maybe caching?

Edit: can you explain a little bit about what range does? I'm having a hard time grokking it.

pselden commented 11 years ago

I just had an idea -- and I'm not sure if I'm missing something important here, but is it even necessary to convert non-integer hashes to integers? I just updated my implementation to support any hash value that can be compared with < and > (before, I was using subtraction for comparisons).

https://gist.github.com/pselden4/33c283e6e98aba82b556

Then you can do something like this:

var crypto = require('crypto');

// Wraps node's crypto module so any supported digest algorithm can be used
// as the ring's hasher; getHash returns a hex string instead of an integer.
var CryptoHasher = function (algorithm) {
  this.algorithm = algorithm;
};

CryptoHasher.prototype.getHash = function (key) {
  return crypto.createHash(this.algorithm).update(key).digest('hex');
};

var md5 = new CryptoHasher('md5');
var ring = new HashRing(servers, md5);

It will do string comparisons on the hashes (which are a bit slower than integer comparisons) -- but it will sort them correctly.
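
One note on why that works: lowercase hex digests of equal length compare lexicographically in the same order as the numbers they encode, so the ring ordering stays correct as long as the digest length is fixed (which it is for any given algorithm).

    // Equal-length lowercase hex strings sort in the same order as their
    // numeric values; the orderings only diverge when the lengths differ.
    console.log('0f' < '10');  // true, matching 0x0f < 0x10
    console.log('ff' < '100'); // false, even though 0xff < 0x100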

3rd-Eden commented 11 years ago

I have merged in my refactor from the last couple of days (I'm still working on the docs before I do a new release). It now provides the same distribution as libketama instead of just being based on it, and it also has the option to provide the same distribution as the Python hash_ring module. Both of these prominent libraries show the same key disruption that you see when you add another server to the distribution test.

I'm not sure what causes this, but I'm guessing it can be resolved by using a different hashing algorithm.