KristofferC / NearestNeighbors.jl

High performance nearest neighbor data structures (KDTree and BallTree) and algorithms for Julia.
Other
413 stars 65 forks source link

KDTree: SoA the `Node` struct to avoid paying alignment cost to store the split dimension #177

Closed KristofferC closed 2 months ago

KristofferC commented 2 months ago

Previously, the split_dim had to be aligned giving:

julia> sizeof(NearestNeighbors.KDNode{Float64})
16

This instead SoAs that struct so that the values and dimensions are stored separately and can thus be stored without paying the alignment cost:

julia> r = rand(3, 10^6) ;

julia> tree = KDTree(r);

# Master
julia> sizeof(tree.nodes)
1599984

# PR
julia> sizeof(tree.split_dims) + sizeof(tree.split_vals)
899991

I see roughly a 5% improvement in knn from this.