sdd / kiddo

Kiddo
Apache License 2.0
91 stars 14 forks source link

Can kiddo be used with rayon for parallel NN calculations with large point clouds? #169

Closed grosskc closed 3 months ago

grosskc commented 5 months ago

I'm considering moving a Pointcloud class I've written in python to rust for various performance reasons. I deal with very large point clouds and currently take advantage of scipy's KDTree ability to perform queries in a multithreaded way: KDTree.query(x, workers=-1) where setting workers=-1 uses all available CPU cores to perform queries over the (large) array of points in x. If I move this to rust, I'd like to use kiddo for KNN-related calculations. I noticed in your associated benchmark code that you set workers=1. So kiddo is definitely very fast compared to scipy's KDTree in single-threaded operations! But I don't know if it is possible to leverage multiple cores (e.g., using rayon) and obtain large speed-ups when dealing with 10s to 100s of millions of points. Would love to hear about this!

sdd commented 3 months ago

You can indeed query Kiddo in a multithreaded way with Rayon. I've done this myself in a few situations. Very large speedups are unlikely though - with a large tree of the kind of sizes that you are thinking of, much of the tree will not fit in the CPU cache and so the main performance bottleneck will be bandwidth / latency to main memory. Querying any KD tree requires a lot of random rather than sequential access, and the memory bandwidth will be shared across several or all cores depending on your CPU / architecture. The best thing to do in this case is to choose the smallest types that you can get away with so that as much of the tree fits in the CPU cache as possible. The more nodes that can fit in the CPU L3 cache, the better performance you will get.