jsocol / django-ratelimit

Cache-based rate-limiting for Django
https://django-ratelimit.readthedocs.io/en/latest/
Other
1.06k stars 188 forks source link

Use Redis Cache Expiration to Minimize Memory Usage (or Encourage a Cron job) #130

Closed mlissner closed 6 years ago

mlissner commented 6 years ago

Unless I'm missing something, enabling this app will slowly but steadily take up more and more memory/hd space as more and more IP addresses visit your site. I can see two solutions for this:

  1. For those that use Redis for their cache, we can use the EXPIRE command to make it so keys go away when they're no longer relevant. For example, if the rate is 5/hour and the last time an IP hit was an hour ago, we can expire the key.

  2. For those that don't use Redis, we should encourage a cron job, similar to the one that's used to clear old old sessions. The cron job could just run a management command that clears out the entire cache. Run it once a week, and it should be OK for most people.

As it is now, we have neither of these things and this app is essentially a memory leak for those using an in-memory cache. Not great.

jsocol commented 6 years ago

Most in-memory caches, including Redis, can and are expected to run in an mode that will automatically evict old (LRU, LFU) items as space runs out. It's not a memory leak so much as an intentional behavior.

It might be worth adding a note to the docs that if you're using a Redis cache you'll need to examine your eviction policies—which are several settings that interact, including maxmemory and maxmemory-policy—configured appropriately.

mlissner commented 6 years ago

Hm, FWIW, we use ours for caching and for Celery tasks, so we can't use the maxmemory eviction policies and have to just be careful about what we put in Redis. Why not handle this client-side?

adamchainz commented 6 years ago

You probably want two Redises if you want one for caching where it's okay to lose the data sometimes and one for permanent data that you can never lose like celery tasks.

jsocol commented 6 years ago

Agreed with Adam.

The other part is that it's not a purely client-side choice. We do have an open PR (#112) to explicitly set the expiration, but there are a couple of nuances:

Ultimately it's not the library's job to tell you how to set up your infrastructure—but what that really means is documenting how we ultimately expect it to behave and not making assumptions in the code.

mlissner commented 6 years ago

Yeah, for me, I'd still rather that this library clean up after itself rather than rely on a cache's policy, but getting it documented would be a good step too, so it doesn't bite people like me.

jsocol commented 6 years ago

At the end of the day, the library can't configure Redis for you. If you are mixing volatile (caches) and non-volatile (celery backend) things in once instance, you'll need to understand the consequences and adjust for that. Even if every library/cache backend properly sets EXPIRE.