KristofferC / NearestNeighbors.jl

High performance nearest neighbor data structures (KDTree and BallTree) and algorithms for Julia.
Other
425 stars 66 forks source link

[Question] Using a skip function that requires data point index for multiple point knn #105

Closed Datseris closed 4 years ago

Datseris commented 4 years ago

Hi there, I am using the syntax

knn(tree, points, k, true)

to get the NN of my points, which is a Vector(SVector), because the docs indicate this is the most performant version to do it. points are all points that also exist inside the data that made the tree. I of course need to exclude all these points from my found neighbors, but I get around this by findin the k+1 neighbors and skipping the first (since I true for the sort argument).

I need to use a skip clause for neighbors, but unfortunately the skip clause does not depend on the points themselves, but on the indices.

Specifically, assume that points[1] has index n in the original data. When I am searching for neighbors of points[1], I want to be able to exclude neighbors that have indices n \pm w, where w a small integer, typically 1 or 2.

I know how to do this using the syntax knn(tree, point, k, true), i.e. doing one search at a time.

Do you have any idea on how to do this for multiple points?

Datseris commented 4 years ago

I've seen the source code of knn, and specifically here the "optimized version":

https://github.com/KristofferC/NearestNeighbors.jl/blob/master/src/knn.jl#L16-L26

Based on this, I wrote up a simple modified loop that goes like this:

function find_all_neighbors(vtree, vs, ns, K, w)
    k, sortres, N = maximum(K), true, length(vs)
    dists = [Vector{eltype(vs[1])}(undef, k) for _ in 1:N]
    idxs = [Vector{Int}(undef, k) for _ in 1:N]
    for i in 1:N
        # The skip predicate also skips the point itself for w ≥ 0
        skip = j -> ns[i] - w ≤ j ≤ ns[i] + w
        NearestNeighbors.knn_point!(vtree, vs[i], sortres, dists[i], idxs[i], skip)
    end
    return idxs
end

I think it makes sense. If anyone confirms that this indeed is the way I should do it, let me know and I can close this!

KristofferC commented 4 years ago

Yeah, the skip function didn't turn out too great (I feel this always happens when you merge something you aren't 100% on heh). Something like what you do there makes sense but it is of course unfortunate that you need to use internal things like knn_point!.

Datseris commented 4 years ago

It's okay, it's not a big deal. True, I use something internal, but that's the reason that I wanted you to confirm. Since I am not sure about knn_point!, I wanted the expert to give the a-ok. Now all is good!