mitodl / micromasters

Portal for learners and course teams to access MITx MicroMasters® programs
https://mm.mit.edu
BSD 3-Clause "New" or "Revised" License

ResponseError: Command # 1 (LLEN 5baa1e67-6288-3559-9315-9723abe9c2fa.reply.celery.pidbox) of pipeline caused er... #4908

Closed sentry-io[bot] closed 3 years ago

sentry-io[bot] commented 3 years ago

Sentry Issue: MICROMASTERS-532

ExecAbortError: Transaction discarded because of previous errors.
  File "redis/client.py", line 3570, in _execute_transaction
    response = self.parse_response(connection, '_')
  File "redis/client.py", line 3635, in parse_response
    result = Redis.parse_response(
  File "redis/client.py", line 853, in parse_response
    response = connection.read_response()
  File "redis/connection.py", line 689, in read_response
    raise response

ResponseError: Command # 1 (LLEN 5baa1e67-6288-3559-9315-9723abe9c2fa.reply.celery.pidbox) of pipeline caused error: OOM command not allowed when used memory > 'maxmemory'.
(22 additional frame(s) were not displayed)
...
  File "redis/client.py", line 3575, in _execute_transaction
    raise errors[0][1]
  File "redis/client.py", line 3562, in _execute_transaction
    self.parse_response(connection, '_')
  File "redis/client.py", line 3635, in parse_response
    result = Redis.parse_response(
  File "redis/client.py", line 853, in parse_response
    response = connection.read_response()
  File "redis/connection.py", line 689, in read_response
    raise response

Unrecoverable error: "ResponseError(\"Command # 1 (LLEN 5baa1e67-6288-3559-9315-9723abe9c2fa.reply.celery.pidbox) of pipeline caused error: OOM command not allowed when used memory > 'maxmemory'.\")"
umarmughal824 commented 3 years ago

The OOM command not allowed when used memory > 'maxmemory' error means that Redis was configured with a memory limit and that limit has been reached. In other words, its memory is full and it can't store any new data. You can see the memory values with the Redis CLI tool. @shaidar / @blarghmatey could you please confirm that?
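For example (a sketch against a live server; the exact numbers will differ per deployment), the current usage and configured limit can be read with:

```shell
# Show current and maximum memory usage (human-readable)
redis-cli INFO memory | grep -E 'used_memory_human|maxmemory_human'
```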

shaidar commented 3 years ago

@umarmughal824 Yeah, that's the problem. The question, however, is why we are running out of memory: tasks appear to be getting stuck in Redis, and every few weeks we hit that limit and end up having to flush the cache. I did that over the weekend, and Redis is already back at 47% usage. I'd predict that by early next week we'll be at the limit again and will have to flush the cache.

pdpinch commented 3 years ago

@annagav what are we using Redis for on MicroMasters? Do you have any theories about why the data isn't getting flushed as expected?

umarmughal824 commented 3 years ago

> @umarmughal824 Yeah that's the problem, however the question is why are we running out of memory as it appears tasks are getting stuck in redis and every few weeks we're hitting that limit and end up having to flush the cache. I did that over the weekend and I just checked redis and it's now at 47% usage. Predicting that by early next week, we'll be at the limit again and will have to flush the cache.

@shaidar I have no idea about it yet but I will troubleshoot that

umarmughal824 commented 3 years ago

> @annagav what are we using Redis for on MicroMasters? Do you have any theories about why the data isn't getting flushed as expected?

@pdpinch Redis serves as the queue for background jobs (Celery tasks): tasks are pushed onto the queue and run one after another in the background without interrupting the user's view.
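A minimal sketch of how an app points Celery at Redis (the setting names and URL here are illustrative, not MicroMasters' actual config):

```python
# Illustrative Celery broker settings; a real deployment would read the
# Redis URL from the environment (e.g. a Redis Labs add-on URL).
import os

# Tasks are pushed onto a Redis list (the queue) by the web process...
CELERY_BROKER_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
# ...and task results/replies are also stored in Redis, which is where
# keys like `*.reply.celery.pidbox` in the traceback above come from.
CELERY_RESULT_BACKEND = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
```

With both the broker and the result backend in Redis, unevicted reply and result keys can accumulate and consume memory over time.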

umarmughal824 commented 3 years ago

@shaidar here's what the Redis documentation says about it:

> When Redis is used as a cache, often it is handy to let it automatically evict old data as you add new data. This behavior is very well known in the community of developers, since it is the default behavior of the popular memcached system.

So, we need to add the following configuration to the Redis server:

  1. maxmemory-policy allkeys-lfu if Redis >= 4.0
  2. maxmemory-policy allkeys-lru if Redis < 4.0

We can set it either with the maxmemory-policy directive in the redis.conf file, or at runtime using the CONFIG SET command, but I would suggest adding it to the .conf file.

That will automatically free space for new keys by evicting old keys (via the LFU (Least Frequently Used) / LRU (Least Recently Used) algorithms), rather than us having to flush the cache manually.

Reference Link: https://redis.io/topics/lru-cache
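The eviction behavior these policies enable can be illustrated with a toy LRU cache in plain Python (a sketch of the idea only, not Redis's actual implementation, which uses approximated LRU):

```python
from collections import OrderedDict

class LRUCache:
    """Toy model of allkeys-lru: when the cache is full, the least
    recently used key is evicted to make room for a new one."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()  # insertion order tracks recency

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)       # refresh recency on overwrite
        self.data[key] = value
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)    # evict least recently used key

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)           # reads also refresh recency
        return self.data[key]

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.set("c", 3)  # full, so "b" (least recently used) is evicted
```

Under this policy, writes never fail with OOM; the cache instead sheds the coldest keys, which is the behavior we want here.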

shaidar commented 3 years ago

Updated config on Redislabs DB

shaidar commented 3 years ago

@pdpinch The config change doesn't seem to have resolved the problem as Redis is at 100% again and needs to be flushed.

umarmughal824 commented 3 years ago

@shaidar did we set maxmemory-policy allkeys-lru as the config value?

shaidar commented 3 years ago

@umarmughal824 The Heroku apps use Redis Labs, and that variable can be configured on the Redis DB. We did configure it, but it didn't seem to help.

umarmughal824 commented 3 years ago

@shaidar could you please run this command in redis-cli and confirm your maxmemory_policy value? I suspect it hasn't been set to what we wanted.

[Screenshot: redis-cli output, 2021-06-17 18:23:54]
shaidar commented 3 years ago

This is the configuration on the Redis DB:

[Screenshot: Redis DB configuration, 2021-06-17 8:52:53 AM]
umarmughal824 commented 3 years ago

@shaidar we should use allkeys-lru; it works fine, as I've confirmed by troubleshooting and testing locally. allkeys-lfu, however, is not working as expected.

Set it with CONFIG SET maxmemory-policy allkeys-lru
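For reference, the runtime change and its verification would look like this against a live server (Redis Labs may instead expose the same setting in its dashboard):

```shell
# Apply the eviction policy at runtime, then confirm it took effect
redis-cli CONFIG SET maxmemory-policy allkeys-lru
redis-cli CONFIG GET maxmemory-policy
```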

shaidar commented 3 years ago

@umarmughal824 Changed to allkeys-lru

umarmughal824 commented 3 years ago

@pdpinch I haven't seen any occurrence of that error in the month since we set that config value, so this issue can be closed now.

@shaidar do you have any concerns about that?