dgraph-io / ristretto

A high performance memory-bound Go cache
https://dgraph.io/blog/post/introducing-ristretto-high-perf-go-cache/
Apache License 2.0

Question: read-your-writes consistency #102

Closed maciej closed 4 years ago

maciej commented 4 years ago

Hi!

Do you have any plans to add operations having read-your-writes consistency at any point in the future?

Thanks!

manishrjain commented 4 years ago

By read-your-writes, do you mean read ALL your writes? Ristretto has an admission policy, so it can drop Sets; that's part of the design. However, if a key is already present in the cache, then any Set for that key will definitely update its value.

maciej commented 4 years ago

@manishrjain by read-your-writes I mean:

cache.Set("key", "value", 1) // set a value
value, found := cache.Get("key")
// expecting found to be true

without sleeping as in the example here: https://blog.dgraph.io/post/introducing-ristretto-high-perf-go-cache/

negbie commented 4 years ago

+1 Currently you cannot use ristretto for deduplication with a very high insert rate, because Sets might take a few milliseconds to become visible.

martinmr commented 4 years ago

I looked into this. Our plan was to do something similar to what Caffeine does to get immediate writes. Caffeine uses a small LRU whose items are either accepted into or rejected from the main LRU by the TinyLFU policy.

However, for this to work correctly, the write and read buffers should be lossless. This is not the case currently because we are using sync.Pool to take advantage of its internal thread-local storage. So this feature has been de-prioritized until we find an optimal way to go about this.

Most of the limitations come from Go lacking some of the features that Caffeine relies on.
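
As a rough illustration of the window-plus-admission idea described above, here is a minimal, single-threaded Go sketch. It is not Caffeine's or Ristretto's code: the names `lru`, `WindowCache`, and `NewWindowCache` are hypothetical, exact per-key counters stand in for TinyLFU's approximate frequency sketch, and `mainCap` is assumed to be at least 1.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one key/value pair held in an LRU list.
type entry struct{ key, value string }

// lru is a tiny, non-thread-safe LRU (hypothetical helper for this sketch).
type lru struct {
	cap   int
	order *list.List // front = most recently used
	items map[string]*list.Element
}

func newLRU(cap int) *lru {
	return &lru{cap: cap, order: list.New(), items: make(map[string]*list.Element)}
}

func (l *lru) get(key string) (string, bool) {
	el, ok := l.items[key]
	if !ok {
		return "", false
	}
	l.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// add inserts or updates key and returns the entry pushed out, if any.
func (l *lru) add(key, value string) *entry {
	if el, ok := l.items[key]; ok {
		el.Value.(*entry).value = value
		l.order.MoveToFront(el)
		return nil
	}
	l.items[key] = l.order.PushFront(&entry{key, value})
	if l.order.Len() > l.cap {
		return l.removeOldest()
	}
	return nil
}

func (l *lru) removeOldest() *entry {
	el := l.order.Back()
	if el == nil {
		return nil
	}
	l.order.Remove(el)
	e := el.Value.(*entry)
	delete(l.items, e.key)
	return e
}

// WindowCache puts every new key into a small window LRU, so a Set is
// readable at once. Keys pushed out of the window must win a frequency
// comparison against the main region's oldest entry to be kept.
type WindowCache struct {
	window, main *lru
	freq         map[string]int // exact counts; an assumption replacing TinyLFU's sketch
}

func NewWindowCache(windowCap, mainCap int) *WindowCache {
	return &WindowCache{window: newLRU(windowCap), main: newLRU(mainCap), freq: make(map[string]int)}
}

func (c *WindowCache) Get(key string) (string, bool) {
	c.freq[key]++
	if v, ok := c.window.get(key); ok {
		return v, true
	}
	return c.main.get(key)
}

func (c *WindowCache) Set(key, value string) {
	c.freq[key]++
	if _, ok := c.main.items[key]; ok { // already admitted: update in place
		c.main.add(key, value)
		return
	}
	candidate := c.window.add(key, value)
	if candidate == nil {
		return // the window still had room
	}
	if c.main.order.Len() < c.main.cap { // main has room: admit directly
		c.main.add(candidate.key, candidate.value)
		return
	}
	victim := c.main.order.Back().Value.(*entry)
	if c.freq[candidate.key] > c.freq[victim.key] {
		c.main.removeOldest()
		c.main.add(candidate.key, candidate.value)
	} // otherwise the candidate is rejected and dropped
}

func main() {
	c := NewWindowCache(1, 1)
	c.Set("key", "value")
	v, ok := c.Get("key")
	fmt.Println(v, ok) // prints "value true"
}
```

In this sketch the admission decision is only deferred until a key leaves the window, which is why a fresh Set is always readable; making the same structure safe and fast under concurrent access is the hard part discussed in this thread.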

ben-manes commented 4 years ago

The read buffer should be lossy, but the write buffer should not be.

I believe the difference is that Caffeine writes to the HashMap immediately, then into the buffer, replays, and evicts. This means that the cache can temporarily exceed its capacity by a small margin. The bounded write buffer adds backpressure to ensure this does not have a runaway effect.

If I understand correctly, Ristretto writes into a channel first and then into the map later, which means you lose visibility. I think that was to improve write performance because Go lacks a good concurrent map, so they took the write buffer idea as a means to avoid coarse locking. Regardless, I think this flow should be flipped to always have a consistent map view, for more obvious code and usage. If there are performance concerns, those should be tackled separately.
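
The flipped flow suggested above (write to the map first, then hand the event to a bounded buffer that a background goroutine replays and uses for eviction) can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not Ristretto's or Caffeine's implementation: `Cache`, `New`, `Close`, and `Len` are hypothetical names, a single mutex-guarded map stands in for a concurrent map, and eviction is plain FIFO rather than TinyLFU.

```go
package main

import (
	"fmt"
	"sync"
)

// item pairs a value with the sequence number of the write that stored it.
type item struct {
	value string
	seq   uint64
}

// event tells the policy goroutine that key was written with sequence seq.
type event struct {
	key string
	seq uint64
}

// Cache is a hypothetical map-first cache: Set updates the map before the
// eviction policy hears about the write, so a Get right after a Set sees it.
type Cache struct {
	mu       sync.RWMutex
	data     map[string]item
	nextSeq  uint64
	capacity int
	writes   chan event // bounded: a full buffer blocks Set (backpressure)
	done     chan struct{}
}

func New(capacity, bufSize int) *Cache {
	c := &Cache{
		data:     make(map[string]item),
		capacity: capacity,
		writes:   make(chan event, bufSize),
		done:     make(chan struct{}),
	}
	go c.drain()
	return c
}

// Set makes the value visible first, then queues the policy event.
func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	c.nextSeq++
	seq := c.nextSeq
	c.data[key] = item{value, seq}
	c.mu.Unlock()
	c.writes <- event{key, seq} // lossless: blocks instead of dropping
}

func (c *Cache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	it, ok := c.data[key]
	return it.value, ok
}

func (c *Cache) Len() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.data)
}

// drain replays write events and evicts in first-write (FIFO) order. The map
// may exceed capacity briefly while events are still queued.
func (c *Cache) drain() {
	defer close(c.done)
	var order []string             // tracked keys, oldest first
	tracked := map[string]uint64{} // newest replayed seq per tracked key
	for ev := range c.writes {
		if seq, ok := tracked[ev.key]; !ok {
			order = append(order, ev.key)
			tracked[ev.key] = ev.seq
		} else if ev.seq > seq {
			tracked[ev.key] = ev.seq
		}
		c.mu.Lock()
		for len(c.data) > c.capacity && len(order) > 0 {
			victim := order[0]
			order = order[1:]
			// Skip the delete if a newer write is still waiting in the buffer.
			if c.data[victim].seq == tracked[victim] {
				delete(c.data, victim)
			}
			delete(tracked, victim)
		}
		c.mu.Unlock()
	}
}

// Close stops accepting writes and waits for queued events to be replayed.
func (c *Cache) Close() {
	close(c.writes)
	<-c.done
}

func main() {
	c := New(2, 16)
	c.Set("key", "value")
	v, ok := c.Get("key") // no sleep needed
	fmt.Println(v, ok)    // prints "value true"
	c.Close()
}
```

The trade-off in this sketch is the one described above: the map can sit slightly over capacity until the buffer is replayed, and writers block when the buffer is full instead of having their Sets dropped.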

minhaj-shakeel commented 4 years ago

GitHub issues have been deprecated. This issue has been moved to Discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.
