karlseguin / ccache

A golang LRU Cache for high concurrency

Potential Memory Leak in CCache under Load Tests #94

Open biancapetrica opened 3 weeks ago

biancapetrica commented 3 weeks ago

During memory-usage profiling of both CCache V2 and V3, I observed memory growth well beyond expected levels, even with the cache size capped at 1 million elements. The same behavior does not occur with Ristretto, where memory usage stays stable once the cache reaches its size limit.

The increasing memory usage suggests a potential memory leak in CCache. Below are the test results for both CCache versions and Ristretto for comparison.

Steps to Reproduce:

  1. Set up a load test where the cache size is capped at 1 million elements.
  2. Insert items into the cache, gradually increasing up to 10 million items.
  3. Monitor memory usage throughout the run (a sketch of such a harness is shown below).
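
For reference, here is a minimal sketch of such a harness against ccache v3. The key format, value size, and the use of `ItemCount`/`GetDropped` plus `runtime.ReadMemStats` for monitoring are assumptions of mine, not the exact test code:

```go
package main

import (
	"fmt"
	"runtime"
	"time"

	"github.com/karlseguin/ccache/v3"
)

// heapGB forces a GC and reports the live heap size in GB.
func heapGB() float64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return float64(m.HeapAlloc) / (1 << 30)
}

func main() {
	// Cache capped at 1 million elements, as in step 1.
	cache := ccache.New(ccache.Configure[[]byte]().MaxSize(1_000_000))

	checkpoints := map[int]bool{
		100_000: true, 1_000_000: true, 1_500_000: true,
		2_500_000: true, 5_000_000: true, 10_000_000: true,
	}

	dropped := 0
	for i := 1; i <= 10_000_000; i++ {
		value := make([]byte, 256) // assumed per-entry payload size
		cache.Set(fmt.Sprintf("key-%d", i), value, time.Hour)
		if checkpoints[i] {
			// GetDropped reports drops since the previous call, so accumulate.
			dropped += cache.GetDropped()
			fmt.Printf("%d items -> heap %.3f GB (cache len %d, drops %d)\n",
				i, heapGB(), cache.ItemCount(), dropped)
		}
	}
}
```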

CCache V2:

100k items -> Memory usage: 0.048 GB
1M items   -> Memory usage: 0.500 GB
1.5M items -> Memory usage: 0.668 GB (cache len 1M, drops 500k)
2.5M items -> Memory usage: 1.046 GB (cache len 1M, drops 1.5M)
5M items   -> Memory usage: 1.974 GB (cache len 1M, drops 4M)
10M items  -> Memory usage: 3.829 GB (cache len 1M, drops 9M)

CCache V3:

100k items -> Memory usage: 0.046 GB
1M items   -> Memory usage: 0.478 GB
1.5M items -> Memory usage: 0.645 GB (cache len 1M, drops 500k)
2.5M items -> Memory usage: 1.024 GB (cache len 1M, drops 1.5M)
5M items   -> Memory usage: 1.951 GB (cache len 1M, drops 4M)
10M items  -> Memory usage: 3.806 GB (cache len 1M, drops 9M)

Ristretto:

100k items -> Memory usage: 0.021 GB
1M items   -> Memory usage: 0.220 GB
1.5M items -> Memory usage: 0.239 GB (cache len 1M, evictions 500k)
2.5M items -> Memory usage: 0.230 GB (cache len 1M, evictions 1.5M)
5M items   -> Memory usage: 0.264 GB (cache len 1M, evictions 4M)
10M items  -> Memory usage: 0.355 GB (cache len 1M, evictions 9M)

While Ristretto’s memory usage remains capped when the cache reaches 1 million items, CCache V2 and V3 show significant memory growth even though the cache length is restricted to 1 million items.

Expected Behavior: Memory usage should stabilize once the cache reaches the set limit of 1 million items, similar to the behavior seen in Ristretto.

Actual Behavior: Memory usage continues to increase in CCache V2 and V3, suggesting that items dropped from the cache are not being fully released, i.e. a potential memory leak. If any further tests or logs would help, I would be happy to provide them.

karlseguin commented 2 weeks ago

Sorry for the delay. I took a look at this yesterday, and I agree with the conclusion.

What I think is happening is that the underlying map can grow beyond the specified max size. Specifically, because the cleaner is running in a background goroutine, the key=>value map can grow to be quite a bit larger than the maximum size. It's possible to insert faster than the cache can enforce the limit. Go's map implementation doesn't free memory, so any spike in size stays. This can be made worse via bad key distribution, where one bucket might grow larger.
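
To see the map-growth half of this in isolation, here is a standalone sketch (plain Go, not ccache code) showing that after every entry is deleted, a Go map keeps its bucket memory, so the heap stays near its peak:

```go
package main

import (
	"fmt"
	"runtime"
)

// heapMB forces a GC and reports the live heap size in MB.
func heapMB() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc / (1 << 20)
}

func main() {
	m := make(map[int][64]byte)
	fmt.Printf("start:              %d MB\n", heapMB())

	for i := 0; i < 2_000_000; i++ {
		m[i] = [64]byte{}
	}
	fmt.Printf("after 2M inserts:   %d MB\n", heapMB())

	for i := 0; i < 2_000_000; i++ {
		delete(m, i)
	}
	// The map is now empty, but its buckets are not released while it stays live.
	fmt.Printf("after deleting all: %d MB\n", heapMB())
	runtime.KeepAlive(m)
}
```

So once ccache's internal map has grown during a burst of inserts, that memory isn't given back even after the cleaner prunes the entries.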

I'm not positive this is the only issue though, since it doesn't seem to plateau (though the leak does seem to slow down, so maybe it eventually does).

ccache is pretty old. It would be interesting to revisit the algorithm and improve the performance, memory footprint, and these kinds of issues. But there are a lot of newer and better alternatives available nowadays, so I'm not sure it's really worth it.

biancapetrica commented 2 weeks ago

Thanks for looking into this and sharing your insights! Do you have any recommendations for alternative cache implementations that could serve well as a layered cache in place of CCache? I'd be interested in exploring options with a strong focus on memory efficiency and performance.

karlseguin commented 2 weeks ago

I don't have any recommendation for a cache that allows specifying a group key, sorry.
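
For context, the "group key" feature referred to here is ccache's layered cache, where entries share a primary key so a whole group can be invalidated at once. A rough sketch of that usage (based on my reading of the v3 README; treat the exact calls as assumptions):

```go
package main

import (
	"fmt"
	"time"

	"github.com/karlseguin/ccache/v3"
)

func main() {
	// A layered cache groups entries under a primary ("group") key,
	// with a secondary key distinguishing each variant.
	cache := ccache.Layered(ccache.Configure[string]())

	cache.Set("/users/goku", "json", `{"name": "goku"}`, 5*time.Minute)
	cache.Set("/users/goku", "xml", `<user name="goku"/>`, 5*time.Minute)

	if item := cache.Get("/users/goku", "json"); item != nil {
		fmt.Println(item.Value())
	}

	// Invalidate every entry in the group with a single call.
	cache.DeleteAll("/users/goku")
}
```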

phuslu commented 1 week ago

If you are interested in a memory-efficient cache for Go, let me recommend my https://github.com/phuslu/lru

It's the most compact and memory-efficient implementation in Go.

PS: there is a nim-lang port at https://github.com/status-im/nim-minilru, which explains the reasoning behind the design.