KristofferC / NearestNeighbors.jl

High performance nearest neighbor data structures (KDTree and BallTree) and algorithms for Julia.
Other
413 stars 65 forks source link

use ArrayOfArrays for return value to reduce the number of allocated arrays #167

Open KristofferC opened 9 months ago

KristofferC commented 9 months ago

A long standing gripe for me has been that indices and distances are returned as standard nested Arrays. Typically, each inner array hold quite a small number of neighbors so it means that we allocate a large number of small arrays.

Using ArrayOfArrays, these are stored contigously in one large flat array instead.

The difference in allocations can be readily seen:

julia> input = rand(3, 10^6);

julia> tree = KDTree(rand(3, 10^6));

julia> @time knn(tree, input, 5);
  1.538003 seconds (2.00 M allocations: 221.253 MiB, 10.03% gc time)

julia> @time knn(tree, input, 5);
  1.489310 seconds (98 allocations: 189.884 MiB, 0.29% gc time)
KristofferC commented 2 months ago

@nanosoldier runtests()

nanosoldier commented 2 months ago

Update on PkgEvalJob KristofferC/NearestNeighbors.jl@048c5a7 vs. KristofferC/NearestNeighbors.jl@bc645d1: Accepted

nanosoldier commented 2 months ago

Update on PkgEvalJob KristofferC/NearestNeighbors.jl@048c5a7 vs. KristofferC/NearestNeighbors.jl@bc645d1: Running

nanosoldier commented 2 months ago

The package evaluation job you requested has completed - possible new issues were detected. The full report is available.