stefankoegl / kdtree

A Python implementation of a kd-tree
ISC License
365 stars 118 forks source link

fix search_knn issue #22

Closed longwosion closed 9 years ago

longwosion commented 9 years ago

The search_knn function does not return k nearest neighbors, when we go up the tree, we need compare bestDist with another child-tree and decide search or not more better neighbors in that child-tree.

But, your current implementation, calculate bestDist by nearest point in result array, I think we should calculate it by most far point in result array.

In current implementation, only nearest point will be replaced with more good found, but the 2nd nearest or K-th nearest will not be replaced with more good found.

I will use a simple data to example this issue:

# three points in 2D
points = [(50, 20), (51, 19), (1, 80)]
tree = kdtree.create(points)
point = (48, 18)

kdtree.visualize(tree)

#            (50, 20) -A
#
#      (1, 80)-C    (51, 19) -B
# we mark them with label A, B, C, and D = (48, 18)

all_dist = []
for p in tree.inorder():
    dist = p.dist(point)
    all_dist.append([p, dist])

all_dist = sorted(all_dist, key = lambda n:n[1])

# we could know that:
# distance(A, D) = 8
# distance(B, D) = 10
# distance(C, D) = 6053

def print_node(a):
    print "--------------------------"
    for node, d in a:
        print node.data, d

print_node(tree.search_knn(point, 1))
# should got point A

print_node(tree.search_knn(point, 2))
# should got point A, B, but we got A, C

print_node(tree.search_knn(point, 3))
# should got point A, B, C
stefankoegl commented 9 years ago

Thanks!