abey79 / vsvg

Fast and portable tools for plotter users
http://whisk.rs
MIT License
107 stars 12 forks source link

Path Index implementation #12

Open abey79 opened 1 year ago

abey79 commented 1 year ago

Like when I first implemented that in vpype, I come to the conclusion that the remove operation on a spatial index is way too costly. So back to an occupancy bool vector and occasional reindexing. This time, I'll try to optimise the reindex trigger based on benchmarks.

As for crates: kiddo 2 seems interesting but still too buggy in 2.0.0-beta.5. I'll use kdtree from the time being. Maybe there are faster option given that I dont need remove?

abey79 commented 1 year ago

Is IndexMap still needed if I have an occupancy vector?? Probably not! A Vec<Option<&PathImpl>> should work just as well. TBD.

Side note: Initial profiling indicates that swap_remove is considerably faster than shift_remove for IndexMap.

abey79 commented 1 year ago

Initial take at occupancy vector: pop_nearest count the number of miss (e.g. kd-tree returning a position that belongs to a path that was already removed), and the index is rebuilt when this number reaches a threshold. It appears that about 20-40% of the initial total path count is a good starting point:

path_index_benches

linesort algorithm n = number of random paths

abey79 commented 1 year ago

Maybe slightly better when the threshold is adapted based on the remaining count upon index rebuild:

path_index_benches

This is valid as of 7027afc4808ad713db7bea5468cbb3956ac17f62

abey79 commented 1 year ago

I did a quick trial by triggering on number of deletion instead of missed queries, and it appears often worse, rarely better, so I'm discarding the idea: path_index_benches

abey79 commented 1 year ago

Quick benchmark with a bigger set of lines. 40% seems good.

EDIT: this was with threshold on remove instead of missed access :/

image
abey79 commented 1 year ago

With threshold based on misses:

image