linsomniac / python-memcached

A python memcached client library.
461 stars 201 forks source link

Consistent Hashing - Ketama #61

Open haridas opened 10 years ago

haridas commented 10 years ago

Hi,

I have the working implementation of ketama consistent hashing algorithm on top of your library. May I send you pull request ? Or you are keeping the library as minimal as possible ? Please let me know, I would love to share my implementation.

Warm Regards, Haridas N.

sergio97 commented 9 years ago

I would really like to have Ketama in this library as well. Haridas, is your implementation available somewhere that I can get it?

haridas commented 9 years ago

Thanks for showing interest in adding this feature into the library. I will send you a pull request with Ketama implementation done on it.

haridas commented 9 years ago

Hi,

I committed changes on the forked version of this repository. This is the initial version of changes to see how it works and how I was used it before in our environment. Here is the link - https://github.com/haridas/python-memcached/commit/18ac1a8e5aa83b276ea5b9702e5f38ecae4e57e7.

There is no test case added to this implementation, since the testcase setup required multiple memcache server, I'm holding on it. Let me know better methods to handle this scenario.

Here is separate gist of the same implementation with testcase. If there is 8 memcachd server running on your machine, then you can test this script by running python new_memcached.py.

Gist link - https://gist.github.com/e7db5d28536fab4ebe52.git

Looking forward to your comments on the implementation.

Hari

sergio97 commented 9 years ago

Thank you Haridas! I will be making a few tweaks to your code before I use it. In case you want my changes I'll open a pull request to your fork when I'm finished.

haridas commented 9 years ago

Cool ! I would like to see the new changes that you make on the codebase, please update this issue or via PR also fine for me. Thank you.

sergio97 commented 9 years ago

I had forgotten to reference this in the PR. Please have a look :)

haridas commented 9 years ago

Awesome. Thanks for making the code much better. Hoping to getting it merged soon.

sergio97 commented 9 years ago

I'm not so happy with the structure of it at the moment. I think it would be better to create a ketama client like so:

import memcache

mc = memcache.Client('127.0.0.1', ketama=True)

or:

import memcache

mc = memcache.Client('127.0.0.1', placement="ketama") # default is "modula"

I'm not sure exactly how to structure the code to achieve this and I just wanted to get it working for now. Combining the two classes into 1 would be messy. I was thinking to pull out "key placement" logic into its own class, the same way that servers are their own _Host class.

Any input?

haridas commented 9 years ago

I was thinking the same thing before making this commit. We can achieve this by use of Meta class. So the overloaded methods can be injected based on the client type ( ketma or normal) on the class creation time itself. By default the client type will be normal.

But then I thought, existing library is very simple to use and flat in structure, and adding this feature by injecting everything in to the Client class would increase the complexity to maintain and use it. So separately shipping an new client is a good trade off for library users and library maintainers.

ehrmann commented 9 years ago

Glad to see this might actually happen!

Around line 1508, I was wondering if bisect.bisect_left might be a better choice, i.e.

slot = bisect.bisect_left(self._ketama_server_slots, h_key)
if slot < len(self._ketama_server_slots):
    server = self._ketama_server_ring[slot]
    if server.connect():
        return (server, key)

bisect.bisect_left will run in log(n) time.

haridas commented 9 years ago

Its very good suggestion. I never came across the bisect module. I think it internally does binary search to locate the position, hence the log(n) time complexity.

I will surely update my commit.

Thank you, Hari.

On Wed, Apr 1, 2015 at 4:31 AM, David Ehrmann notifications@github.com wrote:

Glad to see this might actually happen!

Around line 1508 https://github.com/haridas/python-memcached/commit/18ac1a8e5aa83b276ea5b9702e5f38ecae4e57e7#diff-894c02c873c7b32549152655db8797a8R1508, I was wondering if bisect.bisect_left might be a better choice, i.e.

slot = bisect.bisect_left(self._ketama_server_slots, h_key) if slot < len(self._ketama_server_slots): server = self._ketama_server_ring[slot] if server.connect(): return (server, key)

bisect.bisect_left will run in log(n) time.

— Reply to this email directly or view it on GitHub https://github.com/linsomniac/python-memcached/issues/61#issuecomment-88278207 .

Have a nice day. Haridas N.

haridas commented 9 years ago

@sergio97 Can you please send a PR from your fork to my fork's master ? I would love to have your changes on my fork also :)

ehrmann commented 9 years ago

Since you were asking about testing it, one way would be to restructure the code so you inject a Sharder to the Client rather than inheriting and overriding. The benefit is that you can test the Ketama implementation separately, outside of the client code. But this is also a very "Java" way of doing things.

fluffy-critter commented 4 years ago

I was wondering if any consistent hashing functionality ever made it in? Browsing the source it looks like server selection is still using simple modulo assignment, per https://github.com/linsomniac/python-memcached/blob/bad41222379102e3f18f6f2f7be3ee608de6fbff/memcache.py#L432