dgryski / go-tinylfu

TinyLFU cache admission policy
MIT License

Adaptivity #3

Open ben-manes opened 5 years ago

ben-manes commented 5 years ago

Caffeine 2.7 includes adaptive window sizing, allowing it to dynamically optimize towards recency or frequency. The trace below chains Corda's bitcoin (recency-biased), Lirs' loop (frequency-biased), and Corda's bitcoin again. It achieves a 39.5% hit rate where optimal is 40.3%, and all other policies are 20% or lower. This uses hill climbing, a decay rate to converge, and a restart threshold of 5% to detect workload changes.

[chart: hit rates on the mixed trace]
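The hill-climbing scheme described above (a decaying step size plus a restart threshold) can be sketched in Go. This is illustrative only; all names and constants here are hypothetical and not taken from Caffeine or this package:

```go
package main

import "fmt"

const (
	initialStep      = 0.0625 // first adjustment, as a fraction of capacity
	decayRate        = 0.98   // shrink the step each period so the climb converges
	restartThreshold = 0.05   // a 5% hit-rate swing signals a workload change
)

// climber picks a direction, keeps going while the hit rate improves,
// reverses when it regresses, and decays its step size over time.
type climber struct {
	prevHitRate float64
	stepSize    float64
	increasing  bool
}

// adjust returns the window-size change for the last sample period.
func (c *climber) adjust(hitRate float64) float64 {
	delta := hitRate - c.prevHitRate
	c.prevHitRate = hitRate

	if delta < -restartThreshold || delta > restartThreshold {
		c.stepSize = initialStep // workload changed: restart the climb
	}
	if delta < 0 {
		c.increasing = !c.increasing // got worse: reverse direction
	}
	amount := c.stepSize
	if !c.increasing {
		amount = -amount
	}
	c.stepSize *= decayRate
	return amount
}

func main() {
	c := &climber{stepSize: initialStep, increasing: true}
	for _, hr := range []float64{0.30, 0.32, 0.31, 0.33} {
		fmt.Printf("adjust window by %+.4f\n", c.adjust(hr))
	}
}
```

Caffeine's actual `climb()` method is more involved (it samples in request counts and biases the step), but the feedback loop is the same shape.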

aryehlev commented 1 year ago

@ben-manes hi😀 Really admire ur work on Caffiene:) i would like to implement this but feel like I'm missing something, Wouldn't that require me to change the size of the window/lru dynamically? Which will be very expensive

ben-manes commented 1 year ago

Unfortunately I don't think this project is maintained and was only for weekend fun, but since Go users might treat it as a reference I opened this issue for visibility. So you are welcome to implement this but may not be able to upstream the changes.

Since the movement of items is simple pointer manipulation, the cost is very low. That might still be undesirable for a large cache, so the transfer is amortized across multiple calls (e.g. like Go's hashmap resize). The target is captured each sample period, and then calls within that period assist in transferring the remainder, each moving up to QUEUE_TRANSFER_THRESHOLD entries. You can look at Caffeine's climb() method for how I did it, and I'm happy to hear suggestions for further improvements there as well.

aryehlev commented 1 year ago

Ok, thank you very much for the response :) I will try to implement it this weekend, using the climb() function and this comment as a reference.

dgryski commented 1 year ago

I'm not using this package myself but I will gladly review and merge patches.

aryehlev commented 1 year ago

I have tried implementing these changes and saw a drop in results using this benchmark. There is probably a bug in my code, but maybe there is another explanation (the benchmark generates keys from this distribution: https://en.m.wikipedia.org/wiki/Zipf%27s_law). I would like to try again to implement this.

@ben-manes do you think that this could possibly worsen performance using the zipfian distribution?

https://github.com/vmihailenco/go-cache-benchmark

ben-manes commented 1 year ago

Yes, it could to a small degree, because the climber has to explore to determine the optimal configuration. Unless by luck it starts at the ideal configuration, here frequency / MRU biased, that exploration is a cost. For a modestly sized trace it should not be a very visible penalty and becomes noise over the long term. It would be very noticeable for a tiny trace, where every misprediction is amplified, but that leads to a poor analysis since such cases wouldn't benefit much from caching and are unrealistic.

The Java simulator has a trace rewriter to easily combine and convert traces into an output format. That might make it easier to try different workloads as your Go simulator only needs to support a single format (e.g. Lirs' key-per-line text file).

aryehlev commented 1 year ago

@ben-manes thank you for the response I will look into it further🙏