Koed00 / django-q

A multiprocessing distributed task queue for Django
https://django-q.readthedocs.org
MIT License

Strange memcache issue #412

Open karimone opened 4 years ago

karimone commented 4 years ago

Hello, I need your help to investigate an issue I'm having with django-q and memcached, but I'm not sure whether it is related to django-q or not.

The tasks I'm running are all fine, except that sometimes I get a memcached error, especially when a task needs to access the cache.

For example, in this traceback I'm using waffle, and waffle uses the cache.


File "/opt/python/bundle/49/app/ec/domain/services/partnership.py" in get_discount_percentage_available_for_policy
  91.     if waffle.switch_is_active("global_policy_discount_active"):

File "/opt/python/run/venv/local/lib/python3.6/site-packages/waffle/__init__.py" in switch_is_active
  21.     switch = Switch.get(switch_name)

File "/opt/python/run/venv/local/lib/python3.6/site-packages/waffle/models.py" in get
  44.         cached = cache.get(cache_key)
...

This is how it ends up:

File "/opt/python/run/venv/local/lib/python3.6/site-packages/pymemcache/client/base.py" in _extract_value
  791.         key = remapped_keys[key]

Exception Type: KeyError
Exception Value: b':1:django_q:default:cluster'
Request data not supplied

I get this weird error from pymemcache. It happens only sometimes and I'm not able to reproduce it.

Do you know what django_q:default:cluster means? If you can give me any clue, my first goal is to reproduce the error so that I can test and fix it. I suspect it is something related to the packages I'm using, but I'm not sure.

I also tried a script on the staging env that runs tasks that access the cache (something along the lines of the sketch below), but I wasn't able to reproduce the error. If you can help me, I would really appreciate it.
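
For reference, the script was essentially just queueing a batch of tasks that read the cache, roughly like this (simplified; the module path is a placeholder, the switch name is the one from the traceback):

import waffle
from django_q.tasks import async_task

def check_switch():
    # the cache.get happens inside switch_is_active, same as in the traceback above
    return waffle.switch_is_active("global_policy_discount_active")

# queue a batch of tasks so several workers hit the cache at roughly the same time
for _ in range(100):
    async_task("myapp.tasks.check_switch")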

Thank you

denniseijpe commented 4 years ago

I have exactly the same issue. Some queries to memcached run fine, but once in a while, instead of the stored value, I receive an array with two identifiers (I run a cluster of two memcached instances).

Were you able to solve it?

Koed00 commented 4 years ago

Q_STAT = f"django_q:{PREFIX}:cluster"

That is the key used to save statistics to the current cache backend. It was initially written to use Redis and then changed to use the generic cache. I have no idea how this is affecting memcached, but maybe you guys can piece it together?
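
Roughly, that key just holds a list of per-cluster stat keys, and each cluster saves its own statistics under its own key. A simplified sketch of what the broker does with it (paraphrased, not the exact code):

from django.core.cache import caches

cache = caches["default"]            # whichever cache Q_CLUSTER's 'cache' option points at
Q_STAT = "django_q:default:cluster"  # with the default prefix

def set_stat(key, value, timeout):
    key_list = cache.get(Q_STAT, []) # the call that blows up for you
    if key not in key_list:
        key_list.append(key)
    cache.set(Q_STAT, key_list)      # list of known per-cluster stat keys
    cache.set(key, value, timeout)   # the actual statistics payload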

denniseijpe commented 4 years ago

I'm currently checking whether multiprocessing has anything to do with it. That's the only major difference I can think of compared to other apps.

So far I've checked:

denniseijpe commented 4 years ago

Setting cache to None in Q_CLUSTER solves the issue for me.

Same goes for disabling set_stat in brokers/__init__.py.
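
For the first workaround, this is all I changed in settings (my config, trimmed down):

# setting 'cache' to None here is what made the memcached errors go away for me
Q_CLUSTER = {
    'name': 'myproject',
    'orm': 'default',
    'cache': None,
}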

This code gives the same result as the failing cache.get calls:

        key_list = self.cache.get(Conf.Q_STAT, [])
        print(key_list)

['django_q:myproject:cluster:e4ecabcc-d7e0-4dcf-9c8f-8d731bc6f760']

I couldn't find anything on Google about issues with memcached and Python's multiprocessing.

The issue seems to be that cache.get is not safe for concurrent use. When I change my Django settings to include a second copy of my cache settings:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': 'memcached-1-memcached-svc.default.svc.cluster.local:11211',
        'KEY_PREFIX': 'key1',
    },
    'djangoq': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': 'memcached-1-memcached-svc.default.svc.cluster.local:11211',
        'KEY_PREFIX': 'key2',
    }
}

and change the Q_CLUSTER config's cache option to djangoq, the issue is gone. Django Q now uses its own instance of the memcached client.

Q_CLUSTER = {
    'name': 'myproject',
    'orm': 'default',
    'cache': 'djangoq',
}