uber-node / ringpop-node

Scalable, fault-tolerant application-layer sharding for Node.js applications
http://uber.github.io/ringpop/
MIT License
1.18k stars 146 forks source link

hashring replica point concatenation can create slightly imbalanced rings in particular circumstances #241

Open benfleis opened 8 years ago

benfleis commented 8 years ago

Replica points are added by concatenation onto the server name, which is currently of the form a.b.c.d:X. If you have a=1.2.3.4:55 and b=1.2.3.4:555, you will have some overlap, e.g. with replica points 5 and 55, respectively, where a + '55' === b + '5', and therefore hash identically.

This is probably an atypical configuration, and probably not a major concern, but as we consider a future with logical IDs, this could become a bigger of a potential question mark.

Probably the simplest thing is simply to insert '/' or '#' in between. In any consistent naming scheme, this would make all points unique.

Some trivial code to demonstrate this is below -- no matter how much you adjust the offset, 's1:11' is always underrepresented.

test('random hosts with good points show good distribution', function t(assert) {
    var ring = new HashRing({ replicaPoints: 100 });
    var servers = ['s1:1', 's1:11', 's2:2', 's2:5'];
    var counts = {};
    for (var i = 0; i < servers.length; i++) {
        var server = servers[i];
        ring.addServer(server);
        counts[server] = 0;
    }

    var offset = 20 * 1000000;
    for (var i = 0; i < 1000000; i++) {
        s = ring.lookup((i + offset) + '');
        counts[s]++
    }
    console.log(counts);
    assert.end();
});
thanodnl commented 8 years ago

@CorgiMan Recently found the same. Although the fix is simple it is hard to deploy to running systems without a lot of forwarding inconsistencies. The moment you start upgrading your system after such a fix there will be disagreement on key ownership till the upgrade has completely completed.

Since the chances of such port configurations are slim in our current situation it might break more than it solves.