Doist / hash_ring

Implements consistent hashing in Python (using md5 as hashing function)
http://amix.dk/blog/viewEntry/19367
108 stars 38 forks source link

Hash_ring differs slightly from libketama implementation #15

Open ghost opened 9 years ago

ghost commented 9 years ago

We're using hash_ring + python_memcached as our memcache client. We were investigating switching to twemproxy (Twitter's memcache proxy), and were expecting that both hash_ring and twemproxy would produce identical consistent hash results since both are derived from libketama.

It turned out the results were different about 25% of the time, and on further investigation, I noticed that hash_ring has a subtle loop index issue where it uses only 12 out of the 16 bytes of the md5 digest: https://github.com/Doist/hash_ring/blob/master/hash_ring/ring.py#L83

Libketama and twemproxy use all the 16bytes: https://github.com/RJ/ketama/blob/master/libketama/ketama.c#L448 https://github.com/twitter/twemproxy/blob/master/src/hashkit/nc_ketama.c#L182

ghost commented 9 years ago

A coarse workaround we're using is to change the following constants in twemproxy's nc_ketama.c from:

define KETAMA_POINTS_PER_SERVER 160

pointer_per_hash = 4

to

define KETAMA_POINTS_PER_SERVER 120

pointer_per_hash = 3

With this change, twemproxy and hash_ring choose the same server for a given key. Not ideal, but works for us.