3rd-Eden / node-hashring

hashring is a consistent hashing algorithm for Node.js that is compatible with libketama and Python's hash_ring package.
MIT License

Uneven distribution of shards using node-hashring #3

Closed joecroninallen closed 11 years ago

joecroninallen commented 12 years ago

I use node-hashring in a Redis environment where I have 128 instances of Redis in a sharded setup. Half of the instances live on one box, and the other half live on the second box. For some reason, the instances living on the second box seem to be getting twice as much data as the instances on the first box.

I did not use weights, but it is as if all the instances on the second box had a weight of 2 whereas the instances on the first box had a weight of 1.

Do you have any explanation as to what might be happening?

3rd-Eden commented 12 years ago

Hello,

I don't have an explanation for it... Does it get twice as much data, or twice as many keys?

joecroninallen commented 12 years ago

What we noticed was that of the 128 nodes, the first 64 nodes averaged about half the number of keys of the later 64 nodes. This resulted in twice as much data. You can test the basic case by iterating from 0 to some large number and seeing how the keys get distributed. What I would expect is that keys distribute fairly evenly across nodes, but that doesn't happen. I will follow up with some sample test results that we found.

joecroninallen commented 12 years ago

I don't see where I could leave an attachment, so I will paste the code we were using to test. You can try it out and see what kind of results you get. We tried a couple of string hashing algorithms, and so far they all do the same thing: certain nodes get significantly more keys than other nodes. The results do not follow the typical bell curve that you get if you just generate a bunch of random numbers between 1 and 128, for example. Changing the hashing algorithm or the max number of nodes didn't change things much. Very weird.

joecroninallen commented 12 years ago
//----------------------------------------------------------------------------------------------------------------
// testhashLinear.js
//
// Build a hashring of 128 nodes, hashing random integers
//
// Using hashring.js or djb2, the nodes are skewed and unbalanced.
//
// This test computes which bucket a number hashes into, keeping track of the frequency, and dumping it at the end
// of the test.  Several different hash functions are included, and commented out in testLinearKeys
//
//----------------------------------------------------------------------------------------------------------------

var HashRing = require('hashring');

var nodemap = ["RedisShard-01:50001", "RedisShard-01:50002", "RedisShard-01:50003", "RedisShard-01:50004", "RedisShard-01:50005", "RedisShard-01:50006", "RedisShard-01:50007", "RedisShard-01:50008", "RedisShard-01:50009", "RedisShard-01:50010", "RedisShard-01:50011", "RedisShard-01:50012", "RedisShard-01:50013", "RedisShard-01:50014", "RedisShard-01:50015", "RedisShard-01:50016", "RedisShard-01:50017", "RedisShard-01:50018", "RedisShard-01:50019", "RedisShard-01:50020", "RedisShard-01:50021", "RedisShard-01:50022", "RedisShard-01:50023", "RedisShard-01:50024", "RedisShard-01:50025", "RedisShard-01:50026", "RedisShard-01:50027", "RedisShard-01:50028", "RedisShard-01:50029", "RedisShard-01:50030", "RedisShard-01:50031", "RedisShard-01:50032", "RedisShard-01:50033", "RedisShard-01:50034", "RedisShard-01:50035", "RedisShard-01:50036", "RedisShard-01:50037", "RedisShard-01:50038", "RedisShard-01:50039", "RedisShard-01:50040", "RedisShard-01:50041", "RedisShard-01:50042", "RedisShard-01:50043", "RedisShard-01:50044", "RedisShard-01:50045", "RedisShard-01:50046", "RedisShard-01:50047", "RedisShard-01:50048", "RedisShard-01:50049", "RedisShard-01:50050", "RedisShard-01:50051", "RedisShard-01:50052", "RedisShard-01:50053", "RedisShard-01:50054", "RedisShard-01:50055", "RedisShard-01:50056", "RedisShard-01:50057", "RedisShard-01:50058", "RedisShard-01:50059", "RedisShard-01:50060", "RedisShard-01:50061", "RedisShard-01:50062", "RedisShard-01:50063", "RedisShard-01:50064", "RedisShard-03:50065", "RedisShard-03:50066", "RedisShard-03:50067", "RedisShard-03:50068", "RedisShard-03:50069", "RedisShard-03:50070", "RedisShard-03:50071", "RedisShard-03:50072", "RedisShard-03:50073", "RedisShard-03:50074", "RedisShard-03:50075", "RedisShard-03:50076", "RedisShard-03:50077", "RedisShard-03:50078", "RedisShard-03:50079", "RedisShard-03:50080", "RedisShard-03:50081", "RedisShard-03:50082", "RedisShard-03:50083", "RedisShard-03:50084", "RedisShard-03:50085", "RedisShard-03:50086", "RedisShard-03:50087", "RedisShard-03:50088", "RedisShard-03:50089", "RedisShard-03:50090", "RedisShard-03:50091", "RedisShard-03:50092", "RedisShard-03:50093", "RedisShard-03:50094", "RedisShard-03:50095", "RedisShard-03:50096", "RedisShard-03:50097", "RedisShard-03:50098", "RedisShard-03:50099", "RedisShard-03:50100", "RedisShard-03:50101", "RedisShard-03:50102", "RedisShard-03:50103", "RedisShard-03:50104", "RedisShard-03:50105", "RedisShard-03:50106", "RedisShard-03:50107", "RedisShard-03:50108", "RedisShard-03:50109", "RedisShard-03:50110", "RedisShard-03:50111", "RedisShard-03:50112", "RedisShard-03:50113", "RedisShard-03:50114", "RedisShard-03:50115", "RedisShard-03:50116", "RedisShard-03:50117", "RedisShard-03:50118", "RedisShard-03:50119", "RedisShard-03:50120", "RedisShard-03:50121", "RedisShard-03:50122", "RedisShard-03:50123", "RedisShard-03:50124", "RedisShard-03:50125", "RedisShard-03:50126", "RedisShard-03:50127", "RedisShard-03:50128"];

var optionsJSON = {
    vnode_count: 0
};

//var ring = new HashRing(nodemap,'MD5',optionsJSON);
var ring = new HashRing(nodemap);
var gKeys = [];

//----------------------------------------------------------------------------------------------------------------
// Find (or create) the histogram slot for a given node name
function getIndexForStr(computedIdStr) {
    for (var i=0; i < gKeys.length; ++i) {
        if (gKeys[i]._key == computedIdStr) {            
            return i;
        }
    }

    var newIndex = gKeys.length;

    gKeys[newIndex] = { "_key": computedIdStr, "_frequency": 0 };

    return newIndex;
}

//----------------------------------------------------------------------------------------------------------------
// Used to compare against hashring + crc32, nearly identical results
function hash2(hashKey) {
    var hash = 0;

    for (var i = 0; i < hashKey.length; i++) {
        var charCode = hashKey.charCodeAt(i);
        hash = ((hash<<5)-hash)+charCode;
        hash = hash & hash; // Convert to 32bit integer
    }

    return hash;

}

//----------------------------------------------------------------------------------------------------------------
// Used to compare against hashring + crc32, nearly identical results
function djb2(hashKey) {

    var hash = 5381;

    for (var i=0; i < hashKey.length; ++i) {
        var c = hashKey.charCodeAt(i);
        hash = ((hash << 5) + hash) + c;
    }

    return hash;
}

function randomMinToMax(min, max) {
    var range = max - min + 1;
    return Math.floor(Math.random() * range + min);
}

//----------------------------------------------------------------------------------------------------------------
function testLinearKeys(startIndex, numKeys) {
    console.log('---------------');
    console.log('testLinearKeys startIndex='+startIndex+' numKeys='+numKeys);
    console.log('---------------');

    var startTime = (new Date()).getTime();

    gKeys = [];
    for (var i=startIndex; i < startIndex+numKeys; ++i) {
        //var hashkey = ''+i;
        var hashkeyInt = randomMinToMax(0,100000000);
        var hashkey = ''+hashkeyInt;

        // hashring.js test
        var computedHash = ring.getNode(hashkey);

        // Should be perfectly balanced, for comparison
        //var computedHash = nodemap[i%(nodemap.length)];  

        // rand distribution, not perfectly balanced but relatively close
        //var computedHash = nodemap[hashkeyInt%(nodemap.length)];  

        // using djb2
        // var computedHashRaw = djb2(hashkey);
        // var computedHash = computedHashRaw%(nodemap.length);

        // using hash2
        //var computedHashRaw = Math.abs(hash2(hashkey));
        //var computedHash = computedHashRaw%(nodemap.length);

        //console.log('computedHash=' + computedHash);
        // histogram the result
        var idx = getIndexForStr(computedHash);
        gKeys[idx]._frequency += 1;
    }

    var endTime = (new Date()).getTime();
    console.log('etime = ' + (endTime - startTime) / 1000 + ' seconds');

}

//----------------------------------------------------------------------------------------------------------------
function dumpStats() {

    console.log('dumpStats gKeys.length=' + gKeys.length);

    // Sort descending by frequency
    gKeys.sort(function (itemA, itemB) {
        return itemB._frequency - itemA._frequency;
    });

    for (var i = 0; i < gKeys.length; ++i) {
        console.log(gKeys[i]._key + ' \t\t_frequency= ' + gKeys[i]._frequency);
    }
}

//----------------------------------------------------------------------------------------------------------------
// NOTE: unused leftover helper; it references `pattern` and `header`, which are
// not defined in this script, so it would throw if it were ever called.
function buildRecord(str) {
    var record = {};
    str.split(pattern).forEach(function (value, index) {
        if (header[index] != '')
            record[header[index].toLowerCase()] = value.replace(/"/g, '');
    });
    return record;
}

//----------------------------------------------------------------------------------------------------------------

console.log('testhashLinear start');

console.log('nodemap.length='+nodemap.length);

console.log('---------------');

testLinearKeys(0,100000);
dumpStats();

console.log('---------------');

console.log('testhashLinear end');

3rd-Eden commented 12 years ago

Thanks a lot, I'll give it a look tonight.

jstockdale commented 12 years ago

It looks to me like a couple of things are happening here. First, the hash keyspace is getting decimated during the hashing process.

In particular, return CreateHash(this.algorithm).update(key).digest('hex').split('').map(function(v){ return v.charCodeAt(0) }) does not return the hex value of each character in the hash, but rather the character code for that character, which is neither a compact nor a hexadecimal value.

What you actually want there is something like return +('0x'+v[0]), which takes the first element of v and forces it to be interpreted as a hex string.
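
A minimal sketch of the bug described above, using Node's crypto module (variable names here are illustrative):

var crypto = require('crypto');

// A hex digest is a string of the characters 0-9 and a-f.
var hex = crypto.createHash('md5').update('some key').digest('hex');

// Buggy: charCodeAt(0) of 'a' is 97, not the nibble value 10.
var asCharCodes = hex.split('').map(function (v) { return v.charCodeAt(0); });

// Fixed: parse each character as a hex digit, yielding values 0-15.
var asNibbles = hex.split('').map(function (v) { return +('0x' + v); });

console.log(asCharCodes.slice(0, 4), asNibbles.slice(0, 4));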

Secondly, the current hashValue function seems wrong: since each element is only a single hex character (a nibble, not a byte), you actually need to read two hex characters for every byte in the hash.

This is further complicated by the original Ketama implementation being little-endian. So it should be something like: (key[31-offset(0)]) | (key[31-offset(1)] << 4) | (key[31-offset(2)] << 8) | (key[31-offset(3)] << 12) | (key[31-offset(4)] << 16) | (key[31-offset(5)] << 20) | (key[31-offset(6)] << 24) | (key[31-offset(7)] << 28)

And then the offset function becomes function(x) { return x + k*8 };

This makes md5 work, and if I got the byte ordering right, then it'll map the same as vanilla libketama.
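
Put together, a sketch of the corrected hashValue under the assumptions above, where nibbles is the 32-element array of hex-digit values and k (0-3) selects one of the four 32-bit values a single md5 yields:

function hashValue(nibbles, k) {
    function offset(x) { return x + k * 8; }
    // Assemble eight nibbles into one 32-bit value, little-endian as described.
    var value = (nibbles[31 - offset(0)]) |
                (nibbles[31 - offset(1)] << 4) |
                (nibbles[31 - offset(2)] << 8) |
                (nibbles[31 - offset(3)] << 12) |
                (nibbles[31 - offset(4)] << 16) |
                (nibbles[31 - offset(5)] << 20) |
                (nibbles[31 - offset(6)] << 24) |
                (nibbles[31 - offset(7)] << 28);
    // The >>> 0 is an extra assumption here: it normalizes JavaScript's
    // signed 32-bit result to an unsigned ring position.
    return value >>> 0;
}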

As a side note, the vnode code is also broken -- you unconditionally override the passed options at this.options = {vnode_count: 40};, and the weight code factor = Math.floor((this.options.vnode_count * len * weight) / totalweight); should not be dependent on len (that makes each subsequent server get a larger chunk of space as we iterate from 0 to len).
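
A minimal sketch of the per-server factor with len dropped, as described (the function name is illustrative; whether the total vnode budget should also scale with server count is a separate design choice):

// Sketch: vnode count for one server, proportional to its weight alone.
function vnodeFactor(vnodeCount, weight, totalweight) {
    return Math.floor((vnodeCount * weight) / totalweight);
}

console.log(vnodeFactor(40, 1, 2)); // two equally weighted servers -> 20 each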

That said:

After I fixed all those things, I still had key distribution issues (down to 600 min and 1000 max keys per server, which isn't terrible). After further investigation, the method Ketama uses to decimate the md5 hash to a hash_value is pretty crappy (the C++ implementations show the same distribution issue). I'm working on a general solution for that and will report back if I make any progress.

Cheers,

-S

3rd-Eden commented 12 years ago

Thanks for the awesome research and responses. Most people stopped using md5 for hash rings because it wasn't creating a dense enough distribution; the current default, crc32, is much better at this.

jstockdale commented 12 years ago

Thanks for all the work building this and node-memcached :)

Re: CRC32

Yeah. It depends which problem you're trying to solve. CRC has a better 'looking' distribution, but it has very poor quality bit 'dispersion' and will have serious trouble hashing certain keyspaces.

See: http://home.comcast.net/~bretm/hash/5.html http://home.comcast.net/~bretm/hash/8.html

It looks like the problem with cryptographic hashes is that with any reasonable number of server points (i.e. ~160, as in the original spec), the distribution is still such that the ratio of max count(keys) to min count(keys) across the 128 individual nodes is 1.6:1. This clearly isn't efficient to scale in a large production environment! The problem here is that the hash space for the keys is randomly distributed but behaves, for lack of a better term, sort of like a fractal generator: you get more and more slices, but they do not trend toward a common size.

I'm actually looking at fixing the distribution problem (statistics to the rescue! cc: law of large numbers) by using a great deal more vpoints per server (~4000), and then using the fastest well-distributed hash I can find (the current candidate is MurmurHash3, which on x64 gives you very cheap 128-bit hashes).

So far, it's showing a reasonable performance impact (<10%) and a great deal better distribution (better than 1.1:1).
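
For reference, a minimal sketch of how that max:min ratio can be computed from the gKeys histogram produced by the test script earlier in this thread (names taken from that script):

function maxMinRatio(keys) {
    var counts = keys.map(function (entry) { return entry._frequency; });
    return Math.max.apply(null, counts) / Math.min.apply(null, counts);
}

console.log('max:min ratio = ' + maxMinRatio(gKeys)); // e.g. 1000/600 ≈ 1.67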

Anyway, gonna keep working on this and if I figure out anything else useful, I'll let you know.

jstockdale commented 12 years ago

@joecroninallen can I have permission to reuse that snippet of code as a test case? You didn't add any license, so I have to assume not otherwise. #SadPanda

joecroninallen commented 12 years ago

Yes, absolutely. Please feel free to use it.

jstockdale commented 12 years ago

Awesome. Thank you!

mhart commented 12 years ago

Has this been resolved at all? Looking to use this lib, but this issue looks a little worrying, especially this: https://github.com/3rd-Eden/node-hashring/issues/3#issuecomment-2952958

3rd-Eden commented 12 years ago

Nope, this isn't fully resolved yet. I haven't found the time to dive deep into this issue. The other main pain point is that even though this is a bit broken, if I were to fix it, it would mess up programs that are currently using the hashring module, as their keys would then be stored on different servers.

jstockdale commented 12 years ago

Let me check this out. I did a bunch of research around this but never published any hard results, because first-order fixes required a substantial impact to the ring-setup overhead. Maybe that is OK, since most sensible data-provider architectures don't one-off the ring instantiations anyway.


pselden commented 11 years ago

@jstockdale Did you ever come up with something you could share that would help out with this? The latest version (0.0.8) still has this problem, but you are able to supply your own hash function, which might be of use in this scenario.

Btw, using 'md5' improves the distribution a bit (still ~600 min to ~1000 max), but it still appears to favor the nodes that are added later.
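
For illustration, a sketch of that workaround, assuming the constructor shape used in the test script above and assuming the algorithm argument accepts a function as described (neither verified against 0.0.8). The hash here is a hypothetical stand-in (a 32-bit FNV-1a), not MurmurHash3:

var HashRing = require('hashring');

// Hypothetical replacement hash: 32-bit FNV-1a, written with shift-adds so
// the multiply stays within JavaScript's exact-integer range.
function fnv1a(key) {
    var hash = 0x811c9dc5;
    for (var i = 0; i < key.length; i++) {
        hash ^= key.charCodeAt(i);
        hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) +
                (hash << 8) + (hash << 24)) >>> 0;
    }
    return hash;
}

var nodes = ['RedisShard-01:50001', 'RedisShard-03:50065']; // illustrative
// ~4000 vpoints per server, per the suggestion earlier in this thread
var ring = new HashRing(nodes, fnv1a, { vnode_count: 4000 });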

jstockdale commented 11 years ago

I made some progress. Sorry I never replied. I'm traveling in New Zealand this week but I'll see if I can't write/code something up next week. I'll update the thread with how it goes. Cheers!

3rd-Eden commented 11 years ago

Closing this, as the rewrite of this module ensures compatibility with libketama and the Python hash_ring, which were and still are the main goals of this module. The distribution can still be fine-tuned through vnodes etc.