al8n / stretto

Stretto is a Rust implementation for Dgraph's ristretto (https://github.com/dgraph-io/ristretto). A high performance memory-bound Rust cache.
Apache License 2.0
413 stars 28 forks source link

Dataset for tests and benchmarks #46

Open al8n opened 1 year ago

al8n commented 1 year ago

We need some datasets that can be used to give more insight into the performance and the hit ratio when we add new features.

Yiling-J commented 1 year ago

You could try some widely adopted trace, for example ds1, s3 from arc paper. From my benchmark Ristretto is not good. If Stretto's implentmention is exactly same as Ristretto, I'm interested to see the results. My cache package, with benchmark results: https://github.com/Yiling-J/theine-go

al8n commented 1 year ago

You could try some widely adopted trace, for example ds1, s3 from arc paper. From my benchmark Ristretto is not good. If Stretto's implentmention is exactly same as Ristretto, I'm interested to see the results. My cache package, with benchmark results: https://github.com/Yiling-J/theine-go

The low hit ratio for ristretto in your benchmark may be caused by the write buffer, in ristretto, if you insert an item, and then try to read this item, if the item is still in the write buffer, then you will get a miss.

Yiling-J commented 1 year ago

I agree, but from the image in their blog post and README, the hit ratio should be higher. If you use the same technique(write to buf first), and your benchmark shows similar result, I think we can confirm that. BTW both I and ben believe that write to map first is better, you can take a look ben's reply: https://www.reddit.com/r/golang/comments/12uql3y/theine_020_released_a_generic_cache_which_has/

al8n commented 1 year ago

I agree, but from the image in their blog post and README, the hit ratio should be higher. If you use the same technique(write to buf first), and your benchmark shows similar result, I think we can confirm that. BTW both I and ben believe that write to map first is better, you can take a look ben's reply: https://www.reddit.com/r/golang/comments/12uql3y/theine_020_released_a_generic_cache_which_has/

Yeah, writing to map first is better. I was thinking of having a method that lets the cache can read the item from the write buffer, e.g. add an Arc for the item, and have a hashmap to store the item in the buffer and remove it when the item is handled. But I do not think the idea is good enough. I have appreciated it if there is any idea about this feature.

Yiling-J commented 1 year ago

I think it's just switching the order, first writing to map, then adding to write buffer. I think Ristretto write to buffer first because write buffer is a channel, they drop some Sets under high concurrency when channel is full. This improve write performance, but maybe not the excepted behavior for Ristretto users.