JuliaData / SplitApplyCombine.jl

Split-apply-combine strategies for Julia

reduce combinedims overhead #55

Closed by aplavin 1 year ago

andyferris commented 1 year ago

(Also sorry for the lack of attention - I've been overseas, had Covid in the house, general chaos).

aplavin commented 1 year ago

There's significant overhead associated with handling : (Colon) indices in indexed assignment, even in the simplest cases:

julia> @btime (A[1, 2, :] = $([5])) setup=(A=rand(2, 3))
  13.988 ns (0 allocations: 0 bytes)

julia> @btime (view(A, 1, 2) .= $([5])) setup=(A=rand(2, 3))
  7.006 ns (0 allocations: 0 bytes)

This line in combinedims stood out once when I was profiling my code, and an explicit view performs better. I don't have a deeper explanation for why exactly this difference exists.
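
For reference, here is a minimal sketch showing that the two forms benchmarked above write the same value, so switching between them should not change results. The arrays A and B are illustrative only and this is not the actual combinedims code:

```julia
# Illustrative arrays only; not taken from SplitApplyCombine.jl.
A = zeros(2, 3)
B = zeros(2, 3)

# Indexed assignment: the trailing Colon spans the implicit singleton third dimension.
A[1, 2, :] = [5]

# Explicit view: view(B, 1, 2) is zero-dimensional, and the one-element
# vector broadcasts into it.
view(B, 1, 2) .= [5]

println(A == B)  # both arrays should now hold 5.0 at position (1, 2)
```

Only the timing differs between the two; the measured numbers above depend on the Julia version and machine.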

Btw, unrelated to this PR - an eager version of combinedims may soon appear in Base: https://github.com/JuliaLang/julia/pull/43334.