Closed carstenbauer closed 1 day ago
I implemented that option here:
https://github.com/JuliaFolds2/ChunkSplitters.jl/tree/minchunksize
Unfortunately, adding one more keyword parameter breaks the union splitting that we relied on to avoid allocations when creating the chunks.
I've tried many things, without success, and the allocation test keeps failing:
julia> using ChunkSplitters, BenchmarkTools
julia> function f(x; n=nothing, size=nothing)
           s = zero(eltype(x))
           for inds in chunks(x; n=n, size=size)
               for i in inds
                   s += x[i]
               end
           end
           return s
       end
f (generic function with 3 methods)
julia> @benchmark f($(rand(10^3)); n=4) samples=1 evals=1
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 514.000 ns (0.00% GC) to evaluate,
with a memory estimate of 32 bytes, over 1 allocations.
FYI, I created a draft PR for your branch: https://github.com/JuliaFolds2/ChunkSplitters.jl/pull/46
FWIW:
Things I tried:
As it is, minchunksize can be nothing or an integer. I tried making it always an integer and ignoring it when size was set. That did not remove the allocation.
In the current PR version I tried adding function barriers all over the place (the original code had more conditionals). Nothing changed.
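For readers unfamiliar with the pattern, here is a minimal sketch of what "keywords that are nothing or an integer" plus a function barrier can look like. The names toy_nchunks, _nchunks and minsize are hypothetical and only illustrate the idea. This is not the ChunkSplitters implementation, and it makes no claim about whether such a barrier removes the allocation discussed here.

```julia
# Toy sketch only: `toy_nchunks`, `_nchunks` and `minsize` are hypothetical
# names and are not part of the ChunkSplitters API.

# Function barrier: each method is compiled for one concrete combination
# of argument types, so the body never sees a Union.
_nchunks(len::Int, n::Int, ::Nothing, ::Nothing) = min(n, len)
_nchunks(len::Int, ::Nothing, size::Int, ::Nothing) = cld(len, size)
_nchunks(len::Int, n::Int, ::Nothing, minsize::Int) = max(1, min(n, len ÷ minsize))

# Keyword front end: every keyword is either `nothing` or an `Int`, so the
# compiler has to reason about small unions before dispatching to the barrier.
function toy_nchunks(x; n=nothing, size=nothing, minsize=nothing)
    return _nchunks(length(x), n, size, minsize)
end

toy_nchunks(rand(1000); n=4)               # number of chunks from `n`
toy_nchunks(rand(1000); size=300)          # number of chunks from `size`
toy_nchunks(rand(1000); n=10, minsize=200) # `n` capped by the minimum chunk size
```

Each extra keyword multiplies the number of type combinations the front end has to handle, which is one plausible reason a third keyword could push the compiler past what it resolves statically.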
Simpler benchmark:
julia> @benchmark chunks($(rand(10^3)); n=5) samples=1 evals=1
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 247.000 ns (0.00% GC) to evaluate,
with a memory estimate of 32 bytes, over 1 allocations.
OK, it seems to be fixed now, although only for some heuristic reason.
No, unfortunately it still fails: there are no allocations on 1.11, but one allocation remains on 1.10:
On 1.10:
julia> @benchmark chunks($(rand(10^3)); n=5) samples=1 evals=1
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 19.000 ns (0.00% GC) to evaluate,
with a memory estimate of 32 bytes, over 1 allocations.
On 1.11:
julia> @benchmark chunks($(rand(10^3)); n=5) samples=1 evals=1
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 29.000 ns (0.00% GC) to evaluate,
with a memory estimate of 0 bytes, over 0 allocations.
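An allocation check like the ones above can also be written without BenchmarkTools, using Base's @allocated after a warm-up call. The sketch below is an assumption about how such a check might look. The helper allocated_bytes and the toy workload sum_by_ranges are hypothetical, and the workload builds its index ranges by hand rather than calling chunks.

```julia
# Hypothetical helper (not part of ChunkSplitters or BenchmarkTools).
function allocated_bytes(f, x)
    f(x)                    # warm-up call so compilation is not measured
    return @allocated f(x)  # bytes allocated by the second call
end

# Toy workload: sum `x` over four contiguous index ranges built by hand.
function sum_by_ranges(x)
    s = zero(eltype(x))
    n = 4
    len = length(x)
    for k in 1:n
        lo = div((k - 1) * len, n) + 1
        hi = div(k * len, n)
        for i in lo:hi
            s += x[i]
        end
    end
    return s
end

bytes = allocated_bytes(sum_by_ranges, rand(10^3))
println("bytes allocated: ", bytes)
```

A test could then assert that the reported value is zero. As the two runs above show, the result can differ between Julia versions, so such a test would need to run on every supported version.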
See my comments in the PR.
Released in 2.6.0.
See https://github.com/JuliaFolds2/OhMyThreads.jl/issues/114