Make the behavior of remove_changes_before match the description

jhelovuo / RustDDS

Rust implementation of Data Distribution Service

Apache License 2.0

330 stars 67 forks source link

Make the behavior of remove_changes_before match the description #321

Closed vE5li closed 8 months ago

vE5li commented 9 months ago

The current implementation of remove_changes_before has several problems:

the wrapping_sub results in min_remove_count being close to usize::MAX in a lot of cases
self.changes.keys().take() will iterate the start of the map, effectively discarding newer changes while keeping the older ones
the documented property of keeping a maximum of max_keep_samples is not met
(arguably less important than the rest) self.changes.split_off and std::mem::replace are always called, even if no changes should be discarded

This PR fixes the above issues while attempting to make the function easier to understand

vE5li commented 9 months ago

Changed to draft because I want to add some tests first

jhelovuo commented 9 months ago

Thank you for the improvements. We'll get back to this when you mark it ready.

vE5li commented 9 months ago

After some trial and error I found that due to the heuristic that determines the cleanup, tests are currently almost impossible to write. There are obviously workarounds, like using conditional compilation to only use the heuristic when not in a test case or adding a very big change to flush, but I feel like these are sort of hacky solutions.

My preferred option is to introduce a trait that can be used to determine when to clean the cache. This would allow the caller to clean based on other factors (e.g. RAM usage, time passed, etc.), or to always clean, like in tests.

Since that is a breaking change to the API I will propose that separately from this PR :)

That being said I think this is now ready for review

vE5li commented 9 months ago

Also not sure what is going on with the pipeline, msrv seems to fail for reasons unrelated to my PR(?) and the code coverage has been running for 2 hours already (this also happened on my local machine)

vE5li commented 9 months ago

@jhelovuo ping :)

jhelovuo commented 8 months ago

Also not sure what is going on with the pipeline, msrv seems to fail for reasons unrelated to my PR(?) and the code coverage has been running for 2 hours already (this also happened on my local machine)

Please do not worry about these. The code coverage tool may take long and produce unpredictable results even if your code is just fine. If MSRV fails, then that could be also ok.

jhelovuo commented 8 months ago

This was now merged, but I decided to make a hybrid of this algorithm and the original one.

While your implementation is very clear, it has the complexity drawback of enumerating the cache in reverse, which means work proportional to the amount of samples we are going to keep, rather than remove. In a case the cache is configured to grow large, and only few elements would be removed per call, looping through all the samples would be quite inefficient.

In any case, thank you for your improvements.