ben-manes opened this issue 5 years ago
@ben-manes hi 😀 Really admire your work on Caffeine :) I would like to implement this but feel like I'm missing something: wouldn't that require me to change the size of the window/LRU dynamically? That seems like it would be very expensive.
Unfortunately I don't think this project is maintained; it was only for weekend fun. But since Go users might treat it as a reference, I opened this issue for visibility. You are welcome to implement this, but you may not be able to upstream the changes.
Since the movement of items is simple pointer manipulation, the cost is very cheap. That might still be undesirable for a large cache, so the transfer is amortized across multiple calls (e.g. like Go's hashmap resize). The target is captured each sample period, and within that period subsequent calls assist in transferring the remainder, each moving up to QUEUE_TRANSFER_THRESHOLD entries. You can look at Caffeine's climb() method for how I did it, and I'm happy to hear suggestions for further improvements there as well.
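To make the amortization concrete, here is a minimal sketch in Go, assuming the window and the main probation segment are container/list queues. The type names, constant value, and the ends of the queues that entries move between are illustrative rather than Caffeine's exact logic, though queueTransferThreshold mirrors the role of QUEUE_TRANSFER_THRESHOLD.

```go
package cache

import "container/list"

// queueTransferThreshold caps how many entries move per call, mirroring the
// role of Caffeine's QUEUE_TRANSFER_THRESHOLD; the value here is illustrative.
const queueTransferThreshold = 16

type adaptiveCache struct {
	window     *list.List // admission window (front = MRU)
	probation  *list.List // main space probation segment (front = MRU)
	windowSize int        // current number of entries in the window
	targetSize int        // desired window size, set once per sample period
}

// setTarget records the window size chosen for this sample period; the
// actual movement happens lazily in maintain().
func (c *adaptiveCache) setTarget(size int) {
	c.targetSize = size
}

// maintain is invoked on cache operations and transfers at most
// queueTransferThreshold entries toward the target, so a large adjustment
// is amortized across many calls instead of paid all at once.
func (c *adaptiveCache) maintain() {
	for i := 0; i < queueTransferThreshold && c.windowSize != c.targetSize; i++ {
		if c.windowSize < c.targetSize {
			// Grow the window: pull an entry over from probation.
			e := c.probation.Front()
			if e == nil {
				return
			}
			c.probation.Remove(e)
			c.window.PushFront(e.Value)
			c.windowSize++
		} else {
			// Shrink the window: demote its LRU entry to probation.
			e := c.window.Back()
			if e == nil {
				return
			}
			c.window.Remove(e)
			c.probation.PushFront(e.Value)
			c.windowSize--
		}
	}
}
```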
Ok, thank you very much for the response :) I will try to implement this over the weekend, with the climb() function and this comment as a reference.
I'm not using this package myself but I will gladly review and merge patches.
I have tried implementing these changes and I have seen a drop in results using this benchmark. There is probably a bug in my code, but maybe there is another explanation (the benchmark uses this distribution to create values: https://en.m.wikipedia.org/wiki/Zipf%27s_law). I would like to try implementing this again.
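For reference, this is a minimal sketch of how such a Zipfian key stream can be generated with Go's math/rand; the skew and key range below are illustrative, not the benchmark's actual parameters.

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	// rand.NewZipf requires s > 1 and v >= 1; these values and the key
	// range are placeholders, not the benchmark's configuration.
	r := rand.New(rand.NewSource(1))
	zipf := rand.NewZipf(r, 1.01, 1, 1<<16)

	counts := make(map[uint64]int)
	const requests = 1_000_000
	for i := 0; i < requests; i++ {
		counts[zipf.Uint64()]++
	}

	// A few hot keys dominate the request stream, which is why a
	// frequency-biased (small window) configuration already does well here.
	fmt.Printf("distinct keys: %d out of %d requests\n", len(counts), requests)
}
```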
@ben-manes do you think this could possibly worsen performance under a Zipfian distribution?
Yes, it could to a small degree, because the climber has to explore to determine the optimal configuration. Even if by luck it starts at the ideal configuration, here frequency / MRU biased, that exploration is a cost. Ideally for a modestly sized trace it is not a very visible penalty and becomes noise over the long term. It would be very noticeable for a tiny trace where every misprediction is amplified, but that leads to a poor analysis, since such cases wouldn't benefit much from caching and are unrealistic.
The Java simulator has a trace rewriter to easily combine and convert traces into an output format. That might make it easier to try different workloads, as your Go simulator would then only need to support a single format (e.g. Lirs' key-per-line text file).
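Replaying a key-per-line text trace only takes a few lines of Go; the file name below is a placeholder and feeding the key to the simulator is left as a stub.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
)

func main() {
	// Replay a key-per-line text trace: one cache key per line.
	// "trace.txt" is a placeholder path.
	f, err := os.Open("trace.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	requests := 0
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		key := scanner.Text()
		_ = key // feed the key to the cache / simulator here
		requests++
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("requests replayed:", requests)
}
```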
@ben-manes thank you for the response, I will look into it further 🙏
Caffeine 2.7 includes adaptive window sizing, allowing it to dynamically optimize towards recency or frequency. The trace below chains Corda's bitcoin workload (recency), Lirs' loop (frequency), and Corda's again. It achieves a 39.5% hit rate where optimal is 40.3%, while all other policies are at 20% or lower. This uses hill climbing, a decay rate to converge, and a restart threshold of 5% to detect workload changes.
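A minimal sketch of that hill-climbing loop in Go, assuming a hit rate sampled once per period; the constants and structure here are illustrative and not Caffeine's actual implementation.

```go
package cache

import "math"

// climber adjusts the admission window size by hill climbing on the sampled
// hit rate. All constants below are illustrative placeholders.
type climber struct {
	prevHitRate float64 // hit rate observed in the previous sample period
	stepSize    float64 // current step, shrunk by decayRate as it converges
	direction   float64 // +1 grows the window, -1 shrinks it
}

const (
	initialStep      = 0.0625 // fraction of capacity to move per adjustment
	decayRate        = 0.98   // geometric decay so the climber settles
	restartThreshold = 0.05   // a 5% hit-rate swing signals a workload change
)

// adjust returns the change in window size (as a fraction of capacity) for
// this sample period, given the period's observed hit rate.
func (c *climber) adjust(hitRate float64) float64 {
	delta := hitRate - c.prevHitRate
	c.prevHitRate = hitRate

	if math.Abs(delta) >= restartThreshold {
		// A large swing suggests the workload changed: restart with big steps.
		c.stepSize = initialStep
	} else {
		// Otherwise keep decaying toward a stable window size.
		c.stepSize *= decayRate
	}
	if delta < 0 {
		// The last move hurt the hit rate, so climb in the other direction.
		c.direction = -c.direction
	}
	if c.direction == 0 {
		c.direction = 1
	}
	return c.direction * c.stepSize
}
```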