maypok86 / otter

A high performance cache for Go
https://maypok86.github.io/otter/
Apache License 2.0
1.63k stars 39 forks source link

advised to add some methods #27

Closed Nyx2022 closed 7 months ago

Nyx2022 commented 8 months ago

Clarification and motivation

i wanna add some method to otter to use more conveniently

Acceptance criteria

add three methods i'm thinking now

  1. getKeys() get all keys as a slice of otter
  2. getVals() get all vals as a slice of otter
  3. setIfAbsent(key,val) with a bool as return result,this can set a new value when the key is not in otter,but not update valu when the key is not in otter
maypok86 commented 8 months ago

Hi, thanks for the issue!

I think putIfAbsent will be available relatively soon (it's on hold for now due to holidays and preparation for the first release), but I'm not sure about getAllKeys and getAllValues. Those operations are very heavy and would have to be handwritten for the hash table used in otter, and I'm not sure if users really want to get all keys or values from the cache very often.

To summarize, I don't mind them, but getAllKeys and getAllValues operations have a very low implementation priority.

sgreben commented 8 months ago

@maypok86 how about a Range(f) like xsync MapOf?

This would allow some commonly needed operations (such as serialization, or cache purging by item value) that you might not want to pull into the library itself to be implemented outside of it.

Here's my port of the MapOf.Range code from xsync to the otter hashtable.

maypok86 commented 8 months ago

@sgreben Yes, that looks like a good idea (I was thinking of also implementing getAll functions based on it). Right now I'm completely absorbed with seqlock implementation in go and researching fantastic unexpected things that go's optimizer gives out.

UallenQbit commented 7 months ago

嗨,谢谢你的问题!

我认为 putIfAbsent 很快就会可用(由于假期和第一个版本的准备工作,它暂时搁置),但我不确定 getAllKeys 和 getAllValues。这些操作非常繁重,必须为 Otter 中使用的哈希表手写,我不确定用户是否真的想经常从缓存中获取所有键或值。

总而言之,我不介意它们,但 getAllKeys 和 getAllValues 操作的实现优先级非常低。

getAllKeys and getAllValues are really not used often, but they are really useful, I choose to use fastcache / sync.map to solve getAllKeys and getAllValues I'm really looking forward to otter having getAllKeys and getAllValues

maypok86 commented 7 months ago

@UallenQbit It is very interesting to know the reason for this necessity. The need to serialize the cache in some way I already understood, but Range solves this problem and does it much more efficiently.

maypok86 commented 7 months ago

The choice between fastcache and sync.Map is a bit strange. They seem to be designed for completely different tasks (I'm not even happy with fastcache's inclusion in benchmark results, but I've been asked for it a lot, and ristretto compares itself to it for some reason).

fastcache is a specialized cache written for VictoriaMetrics, which sacrifices everything to be able to store gigabytes of data, because VictoriaMetrics is a metrics store designed for a hell of a load. And with the load on GC, it wouldn't be able to beat the competition. But almost no application needs this property, even Kafka, Dgraph, Vitess quietly use onheap caches, which give in return an excellent eviction policy and a set of handy features. The only limitation for such applications is that they cannot afford to use simple caches like golang-lru which completely kill application parallelism. This was the main reason why ristretto appeared, but unfortunately it has some serious problems which otter is trying to solve.

sync.Map is not a cache at all because it has no size limit, it is only a hash table whose purpose is very specific and in most cases a regular map with RWMutex can destroy it.

P.S. There are also many libraries in go claiming to be faster than ristretto. This is actually true, but the authors don't specify why.🙂 They are faster only because they cut the eviction policy into shards, so they don't need to spend time on synchronization of global eviction policy (it's a very complicated operation). Hit ratio from such cut eviction policy drops very much, because in real life some of the shards will always be hot and evict necessary items, and the rest are cold and keep unnecessary items. You won't see hit ratio measurements in such libraries, of course.😔

maypok86 commented 7 months ago

Okay, I'll get to the SetIfAbsent and Range methods after this issue is closed. Seqlocks did not satisfy me unfortunately, but RCU unexpectedly shows good results.

maypok86 commented 7 months ago

Range is ready, but I need to think about SetIfAbsent, I don't really like the fact that otter will probably need SetIfAbsentWithTTL as well.

maypok86 commented 7 months ago

Otter now also supports SetIfAbsent of two kinds

If you use a cache without or with a constant ttl, you can use the SetIfAbsent(key, value) function.

cache, err := otter.MustBuilder[string, string](10_000).WithTTL(time.Hour).Build()
if err != nil {
    panic(err)
}
cache.SetIfAbsent("key", "value")

If you use cache with variable ttl, you can use SetIfAbsent(key, value, ttl) function.

cache, err := otter.MustBuilder[string, string](10_000).WithVariableTTL().Build()
if err != nil {
    panic(err)
}
cache.SetIfAbsent("key1", "value1", time.Hour)
cache.SetIfAbsent("key2", "value2", time.Minute)

All thanks to a more advanced API that allows you to hide implementation details from the user.

maypok86 commented 7 months ago

If you really need getAll functions, it's better to write one🙂. But it seems that cases when you really need them are quite rare and it's easier to use Range, which is needed in many more situations.