Open N-R-K opened 5 months ago
the memmove worst case [...]
I was thinking, maybe it might be worth implementing tombstone now and instead of memmoving, simply marking the older entry as "dead" to be removed later.
Though I'd assume that the linear search is probably the bigger issue. Maybe maintaining a hashmap of hash -> index
? That seems fairly complicated though given the existing clip-store design.
A simpler method might be to have a probabilistic cache of hash -> index
instead. It'd give us an index to start the (circular) search. This way we won't have to worry about the cache being consistent. It can be wrong, but that just means we'll end up with a linear search. And when it's correct, we get to find the entry faster.
For now though, I haven't really noticed much of an issue in practice. Though I use a fairly low max clip size of 128. And both the linear search + memmove only ever happens when there's an actual duplicate, not on each insertion.
If performance becomes an issue, I can look into doing the probabilistic cache or tombstone, whichever is the bottleneck.
Also to note: I in general am a bit worried about performance here, so I should be clear I am not certain this will be merged yet. But also this should not be the regular case: we already do cheap "last dupe" checks up front.
I in general am a bit worried about performance here
I did a bit of micro benchmarking a couple days ago. It's not super scientific but the observation are as follows:
These numbers are with the default of 1k max clips.
All of this seems to suggest to me that the linear search is not the bottleneck but the memmove
is. So doing tombstone is probably more important for performance. It'd be a bit more invasive change to iteration functions and removal functions, but I think it can be done if needed.
The rest of the review all seems very reasonable. I will amend the patch and address them sometime tomorrow. Cheers.
I think that sounds fine: my main concern was the nominal case (dupe at head), but that's already handled by is_possible_partial in a faster way, so that shouldn't present a problem. Thanks!
uses a linear search to find the duplicate and a memmove to move it over to the end.
not perticularly efficient since each snip is 256 bytes while we're only interested in the first 8. and so iterating over it linearly like this isn't very cache friendly. the memmove worst case can also be around 250KiB with the default config.
Closes: https://github.com/cdown/clipmenu/issues/224