patrickmn / go-cache

An in-memory key:value store/cache (similar to Memcached) library for Go, suitable for single-machine applications.
https://patrickmn.com/projects/go-cache/
MIT License

Feature request: max size and/or max objects #5

Open · ncw opened this issue 11 years ago

ncw commented 11 years ago

To make go-cache as useful as memcached, it needs an upper limit on the amount of memory it can use; otherwise it is easy to cache too many things and explode memory usage.

Ideally it would have a memory limit like memcached does. It could be somewhat approximate, as I'm sure it isn't particularly easy to account for the memory used in the cache.

A max-objects limit would probably be useful as well.

patrickmn commented 11 years ago

I agree it would be useful, but as it stands it would be fairly difficult as you suggest. go-cache, unlike memcached, doesn't process/serialize most of the data, but rather keeps around a pointer to it. Of course, while this makes go-cache orders of magnitude faster, it also means that it isn't distributed in any way (like memcached is), and, unfortunately, that it can't know the size of what it's storing without some serious reflect/unsafe magic/overhead.

The best solution would probably be to make a new cache type that stores items satisfying a Sizeable interface, where foo.Size() returns the item's relative size as an integer. This is how Vitess' LRU cache does it: https://code.google.com/p/vitess/source/browse/go/cache/lru_cache.go This would transfer that responsibility to the user. I'm not sure users can easily determine the size in many cases, though, i.e. when they're storing something more complex than a struct with a []byte field, like a map of structs or slices.
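
A minimal sketch of how that might look; the `sizedCache` type, the `blob` example, and the capacity handling are illustrative assumptions, not go-cache or Vitess code:

```go
package main

import "fmt"

// Sizeable is a hypothetical interface: the user reports the
// relative size of each value as an integer, as Vitess' LRU
// cache does.
type Sizeable interface {
	Size() int
}

// sizedCache tracks the sum of Size() against a capacity.
type sizedCache struct {
	items    map[string]Sizeable
	used     int // running total of Size() over stored items
	capacity int
}

// Set stores v only if it fits; the eviction policy for making
// room is deliberately left open here.
func (c *sizedCache) Set(key string, v Sizeable) bool {
	newUsed := c.used + v.Size()
	if old, ok := c.items[key]; ok {
		newUsed -= old.Size() // overwriting frees the old size
	}
	if newUsed > c.capacity {
		return false
	}
	c.items[key] = v
	c.used = newUsed
	return true
}

// blob shows a user type estimating its own footprint.
type blob struct{ data []byte }

func (b blob) Size() int { return len(b.data) }

func main() {
	c := &sizedCache{items: map[string]Sizeable{}, capacity: 1 << 20}
	fmt.Println(c.Set("a", blob{data: make([]byte, 512)})) // true
}
```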

Another solution would be to simply limit the number of items in the cache at any given time, but that assumes every cache item is roughly the same size. The biggest problem with this is choosing the right items to expunge when necessary. Such a cache would probably end up becoming an LRU clone (i.e. one that keeps track of when items were last accessed, and uses something other than, or in addition to, a map for storage) where every item has the same size, e.g. 1.
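
For illustration, a rough sketch of that kind of count-bounded LRU, pairing Go's container/list with the map; all names are made up and this is not go-cache's implementation:

```go
package main

import (
	"container/list"
	"fmt"
)

// lruCache treats every item as size 1 and evicts the least
// recently used entry once maxItems is exceeded.
type lruCache struct {
	maxItems int
	ll       *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in ll
}

type lruEntry struct {
	key   string
	value interface{}
}

func newLRU(maxItems int) *lruCache {
	return &lruCache{
		maxItems: maxItems,
		ll:       list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get refreshes recency on every read.
func (c *lruCache) Get(key string) (interface{}, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.ll.MoveToFront(el)
	return el.Value.(*lruEntry).value, true
}

func (c *lruCache) Set(key string, value interface{}) {
	if el, ok := c.items[key]; ok {
		el.Value.(*lruEntry).value = value
		c.ll.MoveToFront(el)
		return
	}
	c.items[key] = c.ll.PushFront(&lruEntry{key, value})
	if c.ll.Len() > c.maxItems {
		oldest := c.ll.Back()
		c.ll.Remove(oldest)
		delete(c.items, oldest.Value.(*lruEntry).key)
	}
}

func main() {
	c := newLRU(2)
	c.Set("a", 1)
	c.Set("b", 2)
	c.Get("a")    // "a" is now the most recently used
	c.Set("c", 3) // evicts "b"
	_, ok := c.Get("b")
	fmt.Println(ok) // false
}
```

The MoveToFront on every Get is exactly the kind of read-path bookkeeping overhead that comes up later in this thread.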

I will keep thinking about the best way to approach this in go-cache, but that's my gut feeling right now.

ncw commented 11 years ago

Thanks for the detailed reply and the link to the Vitess cache.

I think you are right that sizing the objects would require the user's help. I was naively thinking that the size could be introspected, but as you say, the objects might be pointers to arbitrarily complicated structures which may or may not need to be accounted for in the cache. The user could have a go at estimating the size with a Sizeable interface, as you suggested.

I still think a total-objects limit would be useful. Let's say you are storing user sessions for 24 hours, but you get on the wrong end of a botnet which creates a lot of sessions in a very short period of time. That is the kind of scenario where memory usage explodes. Flushing user sessions wouldn't be ideal in this case, but it would keep the server running rather than running out of memory.

Memcached could certainly do the job in this case, but I like the idea of doing it really efficiently with go-cache.

htoothrot commented 11 years ago

I would like to note that I, too, believe a limit on the total number of objects would be useful.

patrickmn commented 11 years ago

Sorry for the delay. I am still thinking about how to do this nicely, namely how best to evict items without implementing a copy of Vitess (and taking on the overhead of maintaining an extra list of items and refreshing items on Get). I agree that it's troublesome if somebody explodes your number of items.

miekg commented 8 years ago

Doesn't Redis just randomly evict things from the cache? I think I care more about my cache being memory-bounded than about evicting "the wrong" element. Maybe we can make the actual eviction algorithm pluggable and start with random eviction, which does not need fancy bookkeeping.
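
A pluggable policy could be as small as a single interface. A hypothetical shape, just to make the suggestion concrete (nothing like this exists in go-cache today):

```go
package cache

// EvictionPolicy is a hypothetical hook for swapping eviction
// algorithms: the cache reports accesses, and asks the policy
// for a victim key when it is over its limit. A random policy
// could ignore the notifications and keep no bookkeeping at all,
// while an LRU policy would use them to track recency.
type EvictionPolicy interface {
	RecordGet(key string)
	RecordSet(key string)
	Victim() (key string)
}
```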

patrickmn commented 7 years ago

I like the LRU approach best, but the concerns there are the overhead for read-heavy loads and the memory usage of the Ctime/Atime fields for each cached item.

Random eviction could work, but I don't know if there's a reliable way to implement it, given that the underlying storage is a map. Map traversal order is unpredictable but not truly random, so you can't just traverse, pick the first key, and delete it: https://play.golang.org/p/lUw_1oPTLB But perhaps that is less of a concern for larger maps...
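
For what it's worth, reservoir sampling over a single map traversal does yield a uniformly random key without any parallel structure, at the cost of an O(n) scan per eviction. A sketch, where randomKey is a made-up helper and not part of go-cache:

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomKey picks a uniformly random key from m via reservoir
// sampling: the i-th key seen replaces the current pick with
// probability 1/i, which makes every key equally likely.
func randomKey(m map[string]interface{}) (string, bool) {
	var pick string
	i := 0
	for k := range m {
		i++
		if rand.Intn(i) == 0 {
			pick = k
		}
	}
	return pick, i > 0
}

func main() {
	m := map[string]interface{}{"a": 1, "b": 2, "c": 3}
	if k, ok := randomKey(m); ok {
		delete(m, k) // evict a truly random victim
		fmt.Println("evicted", k)
	}
}
```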

cognusion commented 7 years ago

Random eviction is fine for lazy caches, but when your cache is read-intensive, the objects are expensive to generate, and you end up evicting hot items (and thus immediately taking cache misses), it hurts.

patrickmn commented 7 years ago

Interesting timing; this just hit HN: https://danluu.com/2choices-eviction/

We could conceivably have both options, but LRU still seems best given that we don't have a good way of selecting random items from the cache without maintaining some parallel data structure.
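
A sketch of the two-random-choices idea with exactly such a parallel key slice for O(1) uniform sampling; every name here (twoChoiceCache, lastUsed, and so on) is an assumption for illustration:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

type item struct {
	value    interface{}
	lastUsed time.Time
	idx      int // position in keys, so removal is O(1)
}

type twoChoiceCache struct {
	items map[string]*item
	keys  []string // parallel slice: uniform random pick in O(1)
}

func (c *twoChoiceCache) Set(key string, v interface{}) {
	if it, ok := c.items[key]; ok {
		it.value, it.lastUsed = v, time.Now()
		return
	}
	c.items[key] = &item{value: v, lastUsed: time.Now(), idx: len(c.keys)}
	c.keys = append(c.keys, key)
}

// evictTwoChoices samples two random keys and evicts whichever
// was used less recently, approximating LRU without a full
// recency list.
func (c *twoChoiceCache) evictTwoChoices() {
	if len(c.keys) == 0 {
		return
	}
	a := c.keys[rand.Intn(len(c.keys))]
	b := c.keys[rand.Intn(len(c.keys))]
	victim := a
	if c.items[b].lastUsed.Before(c.items[a].lastUsed) {
		victim = b
	}
	c.remove(victim)
}

// remove swaps the victim's slot with the last key so the
// slice never needs shifting.
func (c *twoChoiceCache) remove(key string) {
	it := c.items[key]
	last := len(c.keys) - 1
	c.keys[it.idx] = c.keys[last]
	c.items[c.keys[it.idx]].idx = it.idx
	c.keys = c.keys[:last]
	delete(c.items, key)
}

func main() {
	c := &twoChoiceCache{items: map[string]*item{}}
	for _, k := range []string{"a", "b", "c", "d"} {
		c.Set(k, k)
	}
	c.evictTwoChoices()
	fmt.Println(len(c.keys), "items remain") // 3 items remain
}
```

Compared to full LRU, this trades the doubly linked list for one timestamp per item and two random probes per eviction.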

chris-pikul commented 7 years ago

I'm also going to vote for this feature. I'll still use this, but with major caution. A bot attack could kill my service without an upper limit.

I was going to use this for caching user objects against session keys so I don't have to hit my DB on every request, so the expiration time is great for this. I'd be OK with randomly dropping entries when over the size limit. It would be even better if it just dumped the oldest object, but I imagine there'd be a performance hit for that.

rfyiamcool commented 6 years ago

Good idea, but I think a max-memory limit would be a better design.

v-antech commented 4 years ago

This seems like a major missing feature. There are use cases where large eviction times totally make sense, but the cache must be memory bound, otherwise the service will eventually run out of memory and die. Any sort of limitation in that regard is much better than nothing at all IMO. I added this cache to our system with great ease, but will probably try to replace it with another one that supports this vital feature.

aaomidi commented 2 years ago

I've started a PR: #146

I'll write up some more benchmarking support for this, but it should be a pretty good way of limiting growth. Since it's a cache, silently failing writes is not that big of a deal.
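
A guess at the general shape of such a capped, silently-dropping Set; this is a sketch of the idea, not the actual code in that PR:

```go
package main

import (
	"fmt"
	"sync"
)

// cappedCache rejects brand-new keys once maxItems is reached.
// Overwrites of existing keys still succeed, and since this is
// a cache, a dropped write just means a later cache miss.
type cappedCache struct {
	mu       sync.Mutex
	maxItems int
	items    map[string]interface{}
}

func (c *cappedCache) Set(key string, v interface{}) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.items[key]; !exists && len(c.items) >= c.maxItems {
		return // silently drop the write instead of growing
	}
	c.items[key] = v
}

func main() {
	c := &cappedCache{maxItems: 1, items: map[string]interface{}{}}
	c.Set("a", 1)
	c.Set("b", 2) // dropped: cache is full
	fmt.Println(len(c.items)) // 1
}
```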