KristofferC / NearestNeighbors.jl

High performance nearest neighbor data structures (KDTree and BallTree) and algorithms for Julia.
Other
413 stars 65 forks source link

KDTree: Wrong results for non-Euclidean metrics #169

Closed aplavin closed 9 months ago

aplavin commented 9 months ago

I noticed a bug where NN.jl KDTree returns a clearly wrong result. I think it can happen whenever the metric is not Euclidean, cleanest reproduction is with WeightedEuclidean:

julia> m = WeightedEuclidean([1e-5, 1])
julia> data = [
       0 0 1
       0 1 0.
   ]

julia> tree = KDTree(data, m, leafsize=1)

julia> p = [1,0.9]

# NN.jl returns #3 as the clostest point
julia> nn(tree, p)
(3, 0.9)

# but actually #2 is much closer:
julia> m(data[:, 3], p)
0.9
julia> m(data[:, 2], p)
0.10004998750624607