Open dokempf opened 2 years ago
Nanoflann v1.5.0 added parallel construction support: https://github.com/jlblancoc/nanoflann/releases/tag/v1.5.0
However, my initial testing was far off the advertised speedup of 3x. I got around a 20% speedup for medium core counts, and the code got slower for large core counts. I am currently hesitant to include that in the code base, at least not without a user interface to control it.
Performance benchmarking shows that as is, the KDTree construction is the bottleneck of the entire algorithm. A naive approach to parallelization could be to distribute the subtree construction to threads. The very coarse levels do not parallelize well in this algorithm, but a sufficiently deep tree mitigates this disadvantage.
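To make the naive approach concrete, here is a minimal, self-contained sketch of distributing subtree construction to threads. It is not nanoflann's implementation and does not touch nanoflann's internals; `Point`, `Node`, `build`, `parallel_depth` and `leaf_size` are hypothetical names chosen for illustration. The assumption is a median-split tree over an index array, where the two children of a node work on disjoint index ranges and can therefore be built concurrently.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <vector>

using Point = std::array<double, 2>;  // hypothetical 2D point type

struct Node {
    std::size_t begin = 0, end = 0;  // covers idx[begin, end)
    int axis = -1;                   // split axis, -1 marks a leaf
    std::unique_ptr<Node> left, right;
};

// Builds the subtree for idx[begin, end). Levels with depth < parallel_depth
// hand their left child to a new thread; deeper levels recurse serially.
std::unique_ptr<Node> build(const std::vector<Point>& pts,
                            std::vector<std::size_t>& idx,
                            std::size_t begin, std::size_t end,
                            int depth, int parallel_depth,
                            std::size_t leaf_size = 10)
{
    assert(begin <= end);
    auto node = std::make_unique<Node>();
    node->begin = begin;
    node->end = end;
    if (end - begin <= leaf_size) return node;  // small range: leaf

    // Median split along the current axis. This step is serial per node,
    // which is why the coarsest levels parallelize poorly.
    const int axis = depth % 2;
    const std::size_t mid = begin + (end - begin) / 2;
    const auto first = idx.begin();
    std::nth_element(first + static_cast<std::ptrdiff_t>(begin),
                     first + static_cast<std::ptrdiff_t>(mid),
                     first + static_cast<std::ptrdiff_t>(end),
                     [&](std::size_t a, std::size_t b) {
                         return pts[a][axis] < pts[b][axis];
                     });
    node->axis = axis;

    if (depth < parallel_depth) {
        // The children touch disjoint parts of idx, so no locking is needed.
        auto left = std::async(std::launch::async, [&] {
            return build(pts, idx, begin, mid, depth + 1, parallel_depth, leaf_size);
        });
        node->right = build(pts, idx, mid, end, depth + 1, parallel_depth, leaf_size);
        node->left = left.get();  // join before the captured locals go out of scope
    } else {
        node->left = build(pts, idx, begin, mid, depth + 1, parallel_depth, leaf_size);
        node->right = build(pts, idx, mid, end, depth + 1, parallel_depth, leaf_size);
    }
    return node;
}
```

In this sketch, `parallel_depth` bounds the number of spawned threads at roughly `2^parallel_depth`, which could serve as the user-facing control: a value of 0 gives a fully serial build. Spawning one thread per subtree is the simplest possible scheme; a thread pool would likely behave better at high core counts.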
Implementation includes two non-trivial aspects:
`nanoflann`'s internal data structures w.r.t. thread safety, as we are operating outside of `nanoflann`'s communicated thread safety guarantee. However, we are only using the static version of nanoflann (the dynamic would definitely not be thread-safe).