stefankoegl / kdtree

A Python implementation of a kd-tree
ISC License
365 stars 118 forks source link

search_knn appears to be wrong #18

Closed vincefernando closed 9 years ago

vincefernando commented 10 years ago

p = [2, 3, 4, 5, 6] B=[[1, 2, 3, 4, 5],[1.1, 2.1, 3.1, 4.1, 5.1],[1.2, 2.2, 3.2, 4.2, 5.2],[1.3, 2.3, 3.3, 4.3, 5.3], [1.4, 2.4, 4.3, 4.4, 5.4], [1.5, 2.5, 3.5, 4.5, 5.5],[1.6, 2.6, 3.6, 4.6, 5.6], [1.7, 2.7, 3.7, 4.7, 5.7],[1.8, 2.8, 3.8, 4.8, 5.8],[1.9, 2.9, 3.9, 4.9, 5.9]] TB = kdtree.create(B) print "The four nearest neighbours in B" k4=TB.search_knn(p,4) print "len(k4) = ", len(k4) for i in range(len(k4)): print k4[i].data

I get the ouput:

The four nearest neighbours in B len(k4) = 3 [1.9, 2.9, 3.9, 4.9, 5.9] [1.8, 2.8, 3.8, 4.8, 5.8] [1.5, 2.5, 3.5, 4.5, 5.5] [1.5, 2.5, 3.5, 4.5, 5.5]

stefankoegl commented 10 years ago

Can you please clarify what exactly you think is wrong?

vincefernando commented 10 years ago

There is a typo in the in the output. I have typed an additional fourth row: [1.5, 2.5, 3.5, 4.5, 5.5] I expected len(knn) to be four rather than three since I have asked 4 neighbours instead of three. I also expected the third and the fourth rows to be [1.7, 2.7, 3.7, 4.7, 5.7] [1.6, 2.6, 3.6, 4.6, 5.6] Have I misinterpreted search_knn?

stefankoegl commented 10 years ago

search_knn() works by performing a nearest-neighbor search while keeping track of all the other evaluated nodes and their distances. It seems that in this case, the algorithm doesn't visit more than the three returned nodes.

vincefernando commented 10 years ago

So the search_knn gives the global NN plus any local NNs in the traversal path. This could be useful in clustering applications since this would avoid recomputations of distances to certain nodes.

gallagth commented 10 years ago

I found the documentation a bit confusing, I was expecting search_knn to return the k nearest neighbors as it's stated, not the one nearest neighbor and k-1 (max) nodes visited to look for it.