Closed vE5li closed 8 months ago
Changed to draft because I want to add some tests first
Thank you for the improvements. We'll get back to this when you mark it ready.
After some trial and error I found that due to the heuristic that determines the cleanup, tests are currently almost impossible to write. There are obviously workarounds, like using conditional compilation to only use the heuristic when not in a test case or adding a very big change to flush, but I feel like these are sort of hacky solutions.
My preferred option is to introduce a trait that can be used to determine when to clean the cache. This would allow the caller to clean based on other factors (e.g. RAM usage, time passed, etc.), or to always clean, like in tests.
Since that is a breaking change to the API I will propose that separately from this PR :)
That being said I think this is now ready for review
Also not sure what is going on with the pipeline, msrv
seems to fail for reasons unrelated to my PR(?) and the code coverage has been running for 2 hours already (this also happened on my local machine)
@jhelovuo ping :)
Also not sure what is going on with the pipeline, msrv seems to fail for reasons unrelated to my PR(?) and the code coverage has been running for 2 hours already (this also happened on my local machine)
Please do not worry about these. The code coverage tool may take long and produce unpredictable results even if your code is just fine. If MSRV fails, then that could be also ok.
This was now merged, but I decided to make a hybrid of this algorithm and the original one.
While your implementation is very clear, it has the complexity drawback of enumerating the cache in reverse, which means work proportional to the amount of samples we are going to keep, rather than remove. In a case the cache is configured to grow large, and only few elements would be removed per call, looping through all the samples would be quite inefficient.
In any case, thank you for your improvements.
The current implementation of
remove_changes_before
has several problems:wrapping_sub
results inmin_remove_count
being close tousize::MAX
in a lot of casesself.changes.keys().take()
will iterate the start of the map, effectively discarding newer changes while keeping the older onesmax_keep_samples
is not metself.changes.split_off
andstd::mem::replace
are always called, even if no changes should be discardedThis PR fixes the above issues while attempting to make the function easier to understand