Closed: holtgrewe closed this issue 1 month ago
I don't think this is a memory leak - I think this is the default behavior of Redis (`maxmemory 0`, which means unlimited).
https://redis.io/topics/lru-cache
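To see whether an instance is running with that unlimited default, a quick check (assuming `redis-cli` can reach the instance) looks like this:

```
# Show the configured memory limit; "0" means unlimited
redis-cli CONFIG GET maxmemory

# Show the eviction policy; the default is "noeviction"
redis-cli CONFIG GET maxmemory-policy
```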
I have had success with the following two settings...
- `maxmemory` - set to a real amount, based on the hardware you have
- `maxmemory-policy` - set the eviction policy to `allkeys-lru`
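As a sketch, the two settings go in `redis.conf` - the `8gb` value below is purely illustrative, so size it to your hardware:

```
# Cap Redis memory and evict least-recently-used keys when the cap is hit
maxmemory 8gb
maxmemory-policy allkeys-lru
```

They can also be applied at runtime without a restart:

```
redis-cli CONFIG SET maxmemory 8gb
redis-cli CONFIG SET maxmemory-policy allkeys-lru
redis-cli CONFIG REWRITE  # persist to redis.conf (requires the server to have been started with one)
```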
If the automated ingest tool gets a cache-miss due to prior key eviction, then it will check with iRODS (and populate the cache again) - which is still okay, just not as efficient as a cache-hit.
Increasing the memory of the machine will just give you access to a bigger cache.
Thanks for the heads-up. Maybe this could go somewhere in a "note well" section of the icai README?
yes, it's always a balance to document other projects :)
I agree, and I'm also facing the expert's dilemma regularly.
Thanks for addressing this. The ordinary icai user might be an expert on iRODS and Python and even Celery, but might not have worked with Redis. I regularly use Redis for caching in Django applications, for example, but have never faced the mentioned problem there.
For posterity - the largest installation I've used this for to date...
```
maxmemory=150G
maxmemory-policy=allkeys-lru
```
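A quick way to confirm what a running instance actually has (not icai-specific, just plain Redis):

```
# Report current usage, the configured limit, and the eviction policy
redis-cli INFO memory | grep -E 'used_memory_human|maxmemory_human|maxmemory_policy'
```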
In addition to whatever cool tricks @trel is going to document, we can also just link to the Redis documentation: https://redis.io/docs/latest/operate/oss_and_stack/management/admin/
There is a section called "Memory" with a few pointers, which I will paste here just in case it goes away:
Memory
- Ensure that swap is enabled and that your swap file size is equal to the amount of memory on your system. If Linux does not have swap set up, and your Redis instance accidentally consumes too much memory, Redis can crash when it is out of memory, or the Linux kernel OOM killer can kill the Redis process. When swapping is enabled, you can detect latency spikes and act on them.
- Set an explicit maxmemory option limit in your instance to make sure that it will report errors instead of failing when the system memory limit is close to being reached. Note that maxmemory should be set by calculating the overhead for Redis, other than data, and the fragmentation overhead. So if you think you have 10 GB of free memory, set it to 8 or 9.
- If you are using Redis in a write-heavy application, while saving an RDB file on disk or rewriting the AOF log, Redis can use up to 2 times the memory normally used. The additional memory used is proportional to the number of memory pages modified by writes during the saving process, so it is often proportional to the number of keys (or aggregate types items) touched during this time. Make sure to size your memory accordingly.
- See the LATENCY DOCTOR and MEMORY DOCTOR commands to assist in troubleshooting.
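For the first and last of those pointers, a quick diagnostic pass might look like this (assuming a Linux host and a local Redis instance):

```
# Confirm swap is actually configured on the host
swapon --show
free -h

# Ask Redis itself for advice on memory and latency problems
redis-cli MEMORY DOCTOR
redis-cli LATENCY DOCTOR
```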
There's also a section in the README that we've added ~in the semi-recent(ish??) past~: https://github.com/irods/irods_capability_automated_ingest?tab=readme-ov-file#starting-redis-server (See also: https://github.com/irods/irods_capability_automated_ingest/issues/78)
Added a section for Redis configuration which links to the official documentation for memory management.
I'm trying to ingest tens of thousands of files.
Redis starts to use more and more memory until all memory is used (there appears to be a "sync time" key for each ingested file), and it eventually fails with the following in the Redis log.
This makes the ingest code crash with the following:
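For anyone reproducing this, one way to confirm that per-file keys are what is growing - the `*sync_time*` pattern below is only a guess based on the key name mentioned above, so inspect a few keys with `--scan` alone first:

```
# Total number of keys in the current database
redis-cli DBSIZE

# Count keys that look like per-file sync-time entries (pattern is an assumption)
redis-cli --scan --pattern '*sync_time*' | wc -l
```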