thesamet / kdtree-scala

KDTree and KDTreeMap implementation in Scala
Apache License 2.0
33 stars 13 forks source link

Support approximating medians when building kd-tree #14

Open davidbkemp opened 4 years ago

davidbkemp commented 4 years ago

When building a kd-tree, you are sorting the data at every level of the recursive call to buildTreeNode. I believe this leads to at least a quadratic runtime performance.

The wikipedia entry for KD Trees mentions that

a popular practice is to sort a fixed number of randomly selected points, and use the median of those points to serve as the splitting plane

It would be nice to have this option. For my particular application, I need to build a tree of N nodes and then perform up to N range queries before discarding the tree. If building the tree is an N^2 operation, then I think I would be better off avoiding using a tree and just use a brute force approach.

thesamet commented 4 years ago

Interesting. My original application had to build the tree just once, so I never looked into optimizing the building phase. PRs will be welcome!

davidbkemp commented 4 years ago

It actually isn’t as bad as I thought. I think your implementation is no worse than N log^2N.

I may revisit this if we decide to use your library. Apparently NlogN is possible.