Open dbaarda opened 5 years ago
but it doesn't tell you if T should be increased or decreased when it is not working well.
Naive question but what if we randomly either increase or decrease T
and if fit
is regressing perform the opposite operation. Too much brute-force?
This is the "periodic probe" idea, where you periodically try "moving the dial" to see what happens so you can figure out where best to set the dial. This is something that things like the BBR TCP congestion control algorithm uses to great affect.
Personally I'm not a fan of this approach because during the probe you hurt performance. I also don't like having multiple modes of operation and step-changes for several reasons.
For this particular application it's got extra problems. With BBR the affect of turning on the probe is immediately apparent, and it goes away almost immediately when you stop probing. With ADLFU changing the decay rate takes a long time to impact the cache, and by the time it's noticeable it's effectively "poisoned" the cache history so the impact lingers long after the probe is turned off.
I'd much rather find some kind of cache metric that can be used to drive a PID controller in a continuously operating mode without probes. I'm pretty sure this can be done, but I need to do some more testing to find the right metric.
This requires good feedback on the cache performance that indicates if a higher or lower T value would perform better.
It's complicated partly because it's nearly impossible to know what the optimal performance of the cache should be, and the signals available when we only keep the decaying hit-count for each entry doesn't give a lot of signal. The optimal hit-rate depends on the load
One possible measure of the cache performance is "fit", where we keep a decaying average of the hit-count values for cache hits. If the current cache hits reflect the current distribution of entry counts, then the average hit-count should be;
Where;
in practice this does seem to give a fairly good indication of when the cache is working well, but it doesn't tell you if T should be increased or decreased when it is not working well.