JuliaStats / Distances.jl

A Julia package for evaluating distances (metrics) between vectors.
Other
433 stars 98 forks source link

Computing distances for GeometryBasics points #262

Open Kevin-Mattheus-Moerman opened 8 months ago

Kevin-Mattheus-Moerman commented 8 months ago

I'm relatively new to Julia but started working a lot with GeometryBasics entities, such as their 3D point types. Here is an example to create a hand full:

using GeometryBasics
V = [ GeometryBasics.Point{3, Float64}( s,   s,   s) for s in range(-3.0,3.0,6)]

I had created my own distance function where I used something like norm(V1[i]-V2[j]) but have just updated my codes to use euclidean(V1[i],V2[j]) instead which seems more efficient. So far I have these two functions:

using GeometryBasics
using Distances

function dist(V1,V2)
    D = Matrix{Float64}(undef,length(V1),length(V2))   
    for i ∈ eachindex(V1)
        for j ∈ eachindex(V2)          
            D[i,j] = euclidean(V1[i],V2[j]) # norm(V1[i]-V2[j])
        end
    end
    return D
end

function mindist(V1,V2; getIndex=false, skipSelf = false )
    D = Vector{Float64}(undef,length(V1))
    d = Vector{Float64}(undef,length(V2))
    if getIndex
        I = Vector{Int64}(undef,length(V1))
    end
    for i ∈ eachindex(V1)
        for j ∈ eachindex(V2)
            if skipSelf && i==j
                d[j] = Inf
            else
                d[j] = euclidean(V1[i],V2[j]) # norm(V1[i]-V2[j])
            end       
        end
        if getIndex
            D[i], I[i] = findmin(d)
        else
            D[i] = minimum(d)
        end
    end
    if getIndex
        return D, I
    else
        return D
    end
end

I was wondering if there is an even better way of doing the above for my application by using your package more optimally? My dist function creates an $n \times m$ array for a set of $n$ points compared to a set of $m$ points. The second function, mindist creates an $n \times 1$ array where only the lowest distance (and its index if requested) to the set of $m$ points is stored.

Thanks for any suggestions.

dkarrasch commented 5 months ago

For the first goal, you should be able to compute pairwise(Euclidean(), V1, V2). It will treat V1 and V2 as iterables and compute exactly what you need.

For the second goal, there's NearestNeighbors.jl. The good thing is that your type should have length(GeometryBasics.Point{3, Float64}) == 3 (already built in) as requested by that package, so you should be able to create a tree out of V1 directly, and then request the single nearest neighbors for V2.

If you encounter difficulties, perhaps consider posting to the Julia discourse. If everything works just fine, please consider closing this issue.