whitfin / cachex

A powerful caching library for Elixir with support for transactions, fallbacks and expirations
https://hexdocs.pm/cachex/
MIT License

Define cache size limit #49

Closed emerleite closed 8 years ago

emerleite commented 8 years ago

I guess we should have a way to set a cache size limit, to avoid hitting the server's RAM limit. I could not find any options to define this.

whitfin commented 8 years ago

@emerleite I've been considering this for a while, but I don't know of an efficient way to trim the cache, or on what grounds to do it.

The most obvious is LRU, but that requires a sorted table in order to easily determine use time. Any ideas? Perhaps LFU should also be an option... that seems to make the most sense in context.

In the meantime, just set TTLs on your keys - that should be an easy workaround.
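
(As a concrete illustration of the TTL workaround: a minimal sketch below, assuming a cache named `:my_cache`; the exact option names have shifted across Cachex versions, so treat it as illustrative rather than definitive.)

```elixir
# start a cache process (normally under a supervision tree)
Cachex.start_link(:my_cache)

# write an entry that expires 60 seconds after insertion
Cachex.put(:my_cache, "session:123", %{user_id: 1}, ttl: :timer.seconds(60))

# once the TTL has passed, reads simply miss
Cachex.get(:my_cache, "session:123")
```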

whitfin commented 8 years ago

Thinking about this further, I feel we can implement LRU with a hook. There will naturally be overhead to this, but it can be toggled on/off, given a clever enough eviction policy. I just need to work out some of the QLC magic behind it.
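
(A rough sketch of the hook idea, assuming a post-hook that receives cache actions through `handle_notify/3` and keeps its own ETS table of last-used timestamps; the module name and the table-as-state setup are hypothetical, and the Hook API has changed across Cachex versions.)

```elixir
defmodule MyApp.LRUTracker do
  # Hypothetical post-hook: stamp a "last used" time for every key the
  # cache touches, so a separate eviction pass can sort by recency.
  use Cachex.Hook

  # notifications arrive as {action_name, argument_list}; the key is
  # assumed to be the first argument for key-based actions
  def handle_notify({_action, [key | _rest]}, _result, table) do
    :ets.insert(table, {key, System.monotonic_time()})
    {:ok, table}
  end

  # ignore notifications that carry no key
  def handle_notify(_other, _result, table), do: {:ok, table}
end
```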

emerleite commented 8 years ago

I set the TTL, but with high throughput I'm not sure that will keep me under the limit anyway.

emerleite commented 8 years ago

@zackehh I guess LFU could be better, but we could also have a simple option to just rotate entries out, regardless of the policy.

emerleite commented 8 years ago

I found this project. Do you think we could take the same approach? - https://github.com/arago/lru_cache

whitfin commented 8 years ago

@emerleite we can't, because using something like an Agent is a 6x slowdown on all reads compared to the throughput Cachex has today.

The reason I never added LRU before is simply because it would typically require a write on every read (like that Agent approach). I have an idea, but need to find some time to try an implementation.
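
(To illustrate "a write on every read" outside of Cachex: a self-contained ETS sketch where each entry is assumed to be a `{key, value, last_used}` tuple.)

```elixir
defmodule LRUExample do
  # Each hit bumps the entry's last-used marker so an eviction pass can
  # later pick the coldest keys - that bump is the extra write that a
  # plain read would otherwise avoid.
  def lru_get(table, key) do
    case :ets.lookup(table, key) do
      [{^key, value, _last_used}] ->
        :ets.update_element(table, key, {3, System.monotonic_time()})
        {:ok, value}

      [] ->
        {:error, :not_found}
    end
  end
end
```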

emerleite commented 8 years ago

@zackehh makes sense. But how about just setting a limit and removing the first-inserted entries? It's only to avoid hitting the system limit.
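
(A rough sketch of that "remove the first inserted" idea, assuming entries are stored as `{key, value, inserted_at}` tuples in a plain ETS table; not how Cachex actually stores entries.)

```elixir
defmodule FIFOExample do
  # Once the table grows past max_size, delete the oldest surplus entries.
  def trim(table, max_size) do
    overflow = :ets.info(table, :size) - max_size

    if overflow > 0 do
      table
      |> :ets.tab2list()
      |> Enum.sort_by(fn {_key, _value, inserted_at} -> inserted_at end)
      |> Enum.take(overflow)
      |> Enum.each(fn {key, _, _} -> :ets.delete(table, key) end)
    end

    :ok
  end
end
```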

whitfin commented 8 years ago

@emerleite first-inserted is possible, but arguably the entries you put into a cache right away and that live the longest are also the ones that most need to stay cached.

whitfin commented 8 years ago

@emerleite how would you imagine this working in the case where you have a distributed cache?

emerleite commented 8 years ago

@zackehh not sure. It's true, we'd have to figure that out.

whitfin commented 8 years ago

Coming back to this:

I have a plan to implement this with a hook, but it won't work at a distributed scale (at least I don't think so). I'm going to investigate a way around this in the future, perhaps by splitting the remote caches out into their own project.

whitfin commented 8 years ago

@emerleite I opened a new issue in #55. I'm going to close this one but feel free to follow that issue instead.

emerleite commented 8 years ago

Cool @zackehh ;)