jaysonsantos / python-binary-memcached

A pure python module (thread safe) to access memcached via it's binary protocol with SASL auth support.
MIT License
165 stars 57 forks source link

"error: integer out of range for 'L' format code" due to timeout=0 #48

Open hheimbuerger opened 8 years ago

hheimbuerger commented 8 years ago

I'm using django-bmemcached and python-binary-memcached with django-brake for rate limiting.

Occasionally, I'm getting the following exception:

  File "/app/.heroku/python/lib/python2.7/site-packages/django/core/handlers/base.py", line 132, in get_response 
    response = wrapped_callback(request, *callback_args, **callback_kwargs) 
  File "/app/.heroku/python/lib/python2.7/site-packages/brake/decorators.py", line 87, in _wrapped 
    _backend.count(func_name, request, ip, field, period) 
  File "/app/.heroku/python/lib/python2.7/site-packages/brake/backends/cachebe.py", line 65, in count 
    cache.set(key, (count, expiration), timeout=int(expiration - time.time())) 
  File "/app/.heroku/python/lib/python2.7/site-packages/django/core/cache/backends/memcached.py", line 91, in set 
    if not self._cache.set(key, value, self.get_backend_timeout(timeout)): 
  File "/app/.heroku/python/lib/python2.7/site-packages/bmemcached/client.py", line 161, in set 
    returns.append(server.set(key, value, time)) 
  File "/app/.heroku/python/lib/python2.7/site-packages/bmemcached/protocol.py", line 506, in set 
    return self._set_add_replace('set', key, value, time) 
  File "/app/.heroku/python/lib/python2.7/site-packages/bmemcached/protocol.py", line 477, in _set_add_replace 
    time, str_to_bytes(key), value)) 
error: integer out of range for 'L' format code 

It turns out that the problem is django-bmemcached passing a value of -1 as the argument timeout to python-binary-memcached. The latter tries to pack this as an L (unsigned long) into the struct, which of course fails for -1.

The -1 comes from the following code in BaseMemcachedCache.get_backend_timeout():

elif int(timeout) == 0:
    # Other cache backends treat 0 as set-and-expire. To achieve this
    # in memcache backends, a negative timeout must be passed.
    timeout = -1

The code before that indicates that it occures when an entry is set the moment it expires:

cache.set(key, (count, expiration), timeout=int(expiration - time.time()))

According to the docs, a timeout of 0 for cache.set() is valid and won't cache the value. However, that's apparently not how python-binary-memcached works.

Here I get a bit lost, just starting to use these libraries for the first time. I'm already pretty confused which code comes from which library. Can you help me out?

jaysonsantos commented 8 years ago

Hi @hheimbuerger I was checking memcached docs and they don't say anything about keys that never expire, the best way to do this would be add get_backend_timeout to django-bmemcached to map the -1 to 0xffffffe which is the maximum allowed time on memcached, 0xfffffff in some cases would make it return not found. Basically python-binary-memcached is the lib that talks directly to memcached and django-bmemcached is just a wrapper to python-binary-memcached to work with django.

hheimbuerger commented 8 years ago

@jaysonsantos So are you saying the comment in the Django source code ("To achieve this in memcache backends, a negative timeout must be passed.") is wrong and we should open a ticket with Django?

I don't really understand your comment on changing it to 0xffffffe. Wouldn't that make the value never expire? My understanding of the term "set-and-expire" (and the intention) is that it's the complete opposite!

Also, I don't think the original intention here was ever to explicitly say something about the timeout. The timeout 'just happens to come out of 0' by sheer coincidence and rounding errors.

hheimbuerger commented 8 years ago

As a temporary workaround, I'm now overwriting django-brake's CacheBackend.count() and I've changed

cache.set(key, (count, expiration), timeout=int(expiration - time.time()))

to

cache.set(key, (count, expiration), timeout=int(expiration - time.time()) or 1)

to prevent the timeout==0 edge case—assuming the core intention was the 'set-and-expire' the Django source code references.

@gmcquillan: Allow me to mention you here, because you appear to be the core maintainer of django-brake. Please let me know if you think this is a django-brake issue and I can move the issue there. I'm still unclear in which of the three libraries this should be fixed.

I'm a bit surprised that I'm the only (or at least first) one experiencing this. I thought I'm pretty much using all three libraries in out-of-the-box configuration.

jaysonsantos commented 8 years ago

1 - No, normally what libs like pylibc do is sending 0xffffffff when you send -1, as this seems to be the default behavior python-binary-memcached should also use it. 2 - memcached does not have the concept as never expire (if you send 0 it will do it but they don't guarantee that time).

jaysonsantos commented 8 years ago

Could you try your previous code with this commit [1]? [1] https://github.com/jaysonsantos/python-binary-memcached/commit/c399163ea84b9731b7e81c514b5782459f840039

hheimbuerger commented 8 years ago

@jaysonsantos The "never expire" was confusing, I take that back. What I meant was "be kept around for as long as memcached is willing to hold on to it".

Reading your commit, it looks to me like that's what's going to happen when django-brake passes down 0 as the timeout. I'm pretty sure that's not what I want (because my memcached is very limited and also because I'm worried about the side effects this might have on django-brake), so I'm hesitant to deploy that change.

Are you willing to elaborate a bit more on the intentions and consequences your change has?

gmcquillan commented 8 years ago

@hheimbuerger which version of python, django, and django-brake are you using?

jaysonsantos commented 8 years ago

@hheimbuerger I just mimicked pylibmc's behavior which is not wrong, I didn't see that coming before.

hheimbuerger commented 8 years ago

@gmcquillan I'm using Python 2.7.11 (on Heroku's Cedar platform, i.e. Linux), django 1.8.12 and django-brake 1.5.2. Thanks for taking a look, guys!

gmcquillan commented 8 years ago

The value being passed to Django's cache backend is an int. It looks like it's being coerced to a Long before reaching bmemcache's protocol._set_and_replace function. I'd look there.

hheimbuerger commented 8 years ago

@gmcquillan More specifically, the value being passed in by django-brake is not just an int, but it can be 0. That's the root of the problem!

brake/backends/cachebe.py:65:

cache.set(key, (count, expiration), timeout=int(expiration - time.time()))

Now when expiration ~ time.time(), then timeout=0. Django core (BaseMemcachedCache.get_backend_timeout()) turns this into -1 (see my first post), which python-binary-memcached tries to pack into an 'unsigned long' to send it over the wire to memcached.

Unsurprisingly, packing -1 into an unsigned long fails.

I think you're right that this isn't really django-brake's fault. After all, timeout=0 is a documented case of Django's caching layer. However, it was the easiest to work around from django-brake, because there I could just prevent timeout from ever being 0.

gmcquillan commented 8 years ago

Are you defining your periods in units smaller than a second? How would the timeout be getting set to 0 with django-brake?

hheimbuerger commented 8 years ago

@gmcquillan I'm not quite following. Let's say expiration was 1461678289.607911. Now let's say time.time() is 1461678289.0, then

timeout=int(expiration - time.time())
timeout=int(1461678289.607911 - 1461678289.0)
timeout=int(0.607911)
timeout=0

That's how the timeout is set to 0 by django-brake. It happens in the last second before expiration is reached.

gmcquillan commented 8 years ago

@hheimbuerger I grok it now, thanks!

However, 0 should mean that the value is not cached. -1, or None is cached forever. https://docs.djangoproject.com/en/1.9/topics/cache/#cache-arguments

A value of 0 causes keys to immediately expire (effectively “don’t cache”).

We could certainly add a configuration value for making sure that this case is never hit. However, if the cache backend is well behaved, then it should be a noop, right?

hheimbuerger commented 8 years ago

@gmcquillan Well, that was my concern about the change @jaysonsantos just made in c399163. It looked to me this will turn 0 into 'cache for a really long time'.

But reading it again now, I might be wrong. I'm still a bit confused about the where in the different layers across these three libraries things are happening and what I'm looking at.

Update: Nope, I still think a timeout==0 will get turned into time==-1 by the django core, which means python-binary-memcached will send MAXIMUM_EXPIRE_TIME over the wire to memcached.

hheimbuerger commented 8 years ago

What's the status of this issue? I still think your interpretation of time == -1 as MAXIMUM_EXPIRE_TIME is in violation of the Django code, which assumes that it means "set-and-expire".

jaysonsantos commented 8 years ago

Did you check what happens to pylibmc? My idea is to have the same behavior, and if I am not wrong it will be the max time - (abs(value)), and also on memcached there is no concept of infinity. The right fix would be to change the expire time to that calc.

gmcquillan commented 8 years ago

I was curious about the official Django stance on this, and it looks like they decided to go with emitting -1 to Memcached to indicate that the key should be stored as long as there's room in the slab.

https://github.com/django/django/commit/89955cc35f3636684ea6f2a6c9504b35a3050f0f#diff-d603fa6afef1a44825847ccd4e9683a6R51

Henrik,

If you're having a verified problem where the TTL is set to 0 (and the TTL is then interpreted as "keep as long as possible"), then feel free to send me a PR where the behavior of the TTL setting is adjustable (preferably through a settings change. I freely admit that different libraries probably have different semantics. So it'll be good to have flexibility, especially when something like TTL is so crucial to the functioning of a ratelimiting library.

Cheers, -Gavin

On Fri, Aug 19, 2016 at 8:40 AM, Jayson Reis notifications@github.com wrote:

Did you check what happens to pylibmc? My idea is to have the same behavior, and if I am not wrong it will be the max time - (abs(value)), and also on memcached there is no concept of infinity. The right fix would be to change the expire time to that calc.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jaysonsantos/python-binary-memcached/issues/48#issuecomment-241053587, or mute the thread https://github.com/notifications/unsubscribe-auth/AATHHLaQizucEK_8cL6uVTnyUAWuWcEWks5qhc6BgaJpZM4IQARv .

hheimbuerger commented 8 years ago

That's an interesting change you found there in Django! That's for 1.10, correct? I'm still in the process of upgrading from 1.8 to 1.9...

I'll have to think about that, I'm still pretty confused. Or debug through the whole stack, but that's not so easy for me to understand, too.

Are you convinced that your current solution is working correctly with Django 1.8–1.10 and are you planning to release it to PyPI soon?

gmcquillan commented 8 years ago

Yeah, it's from the first instance I could find of the Django cache system making a decision about what the TTL values ought to be.

I've only tested it through Django 1.8.2.

On Fri, Aug 19, 2016 at 10:28 AM, Henrik Heimbuerger < notifications@github.com> wrote:

That's an interesting change you found there in Django! That's for 1.10, correct? I'm still in the process of upgrading from 1.8 to 1.9...

I'll have to think about that, I'm still pretty confused. Or debug through the whole stack, but that's not so easy for me to understand, too.

Are you convinced that your current solution is working correctly with Django 1.8–1.10 and are you planning to release it to PyPI soon?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jaysonsantos/python-binary-memcached/issues/48#issuecomment-241081188, or mute the thread https://github.com/notifications/unsubscribe-auth/AATHHGzpT-JqDK_KFqgZ9o539QId4I90ks5qheengaJpZM4IQARv .

jaysonsantos commented 8 years ago

I found something clarifying, https://github.com/memcached/memcached/issues/142#issuecomment-222334563 on the ASCII protocol this is what happens

if (exptime < 0)
        exptime = REALTIME_MAXDELTA + 1;

But in the binary, there is no negative ttl, the right thing to do is to mimic this behavior.