stefankoegl / kdtree

A Python implementation of a kd-tree
ISC License
365 stars 118 forks source link

Optimisations for knn search #26

Closed denbeigh2000 closed 9 years ago

denbeigh2000 commented 9 years ago

Hi,

I've been using this KDTree implementation for a simple KNN classifier, and made a small optimisation to speed up the KNN search.

I noticed in KDNode._search_node(), you sort the nodes by distance, which you already calculate and store in the results dict. The first commit changes the sorting operation to use the already calculated values, rather than recalculating the point-point distance for each item.

I ran it with my data set for my classifier, which reads a file, performs 500 inserts and then 268 KNN searches, and got my total runtime down from 9.8s to 2.8s(!!!!!!)

The second commit avoids doing this sort altogether if no nodes have been added to the list, and uses the previous values of bestNode and bestDist, as the values will not have change. This optimisation is a lot smaller, but could be useful for some search edge cases. In my particular data set it shaved an extra .5s off my already reduced search time, bringing me down to 2.3s.

Hope this speeds up your implementation of whatever you use this for :)

Denbeigh

stefankoegl commented 9 years ago

Thanks a lot for your contribution!

stefankoegl commented 9 years ago

I've just released 0.12 which contains your changes.

denbeigh2000 commented 9 years ago

Thanks mate! :)