Datseris opened 6 years ago
It didn't change anything when I tried the above in the nothing matters branch.
MWE:
```julia
julia> @btime muladd(dt,ff,up) setup = ( integ = init(prob1, Euler(), dt=0.1); dt=integ.dt; ff=integ.fsalfirst; up=integ.uprev );
  32.395 ns (1 allocation: 80 bytes)

julia> @btime muladd(dt,ff,up) setup = ( dt=rand(); ff=@SMatrix(rand(3,3)); up=@SMatrix(rand(3,3)) );
  2.438 ns (0 allocations: 0 bytes)
```
Even weirder.
```julia
julia> @btime muladd(dt,ff,up) setup = ( dt=rand(); ff=@SMatrix(rand(3,3)); up=@SMatrix(rand(3,3)) );
  2.539 ns (0 allocations: 0 bytes)

julia> @btime muladd(dt,ff,up) setup = ( dt=rand(); ff=@SMatrix(rand(3,3)); up=@SMatrix(rand(3,3)); integ = init(prob1, Euler(), dt=0.1) );
  46.789 ns (4 allocations: 256 bytes)

julia> @btime muladd(dt,ff,up) setup = ( dt=rand(); ff=@SMatrix(rand(3,3)); up=@SMatrix(rand(3,3)); integ = Ref(Vector{Float64}(undef, 5000)));
  2.728 ns (0 allocations: 0 bytes)
```
...
I don't understand... why would `muladd` depend on whether you create an integrator or not? In the second case `dt`, `ff`, `up` are all independent from the integrator. I don't even understand what `init` is doing there in the first place :D
It does nothing, absolutely nothing, to the code that is being benchmarked. Except, of course, changing the timing. And the allocations.
Man I am sorry I always lead you to finding these gems :D
This happened before with random numbers in Optim. Let me find the issue...
@andyferris or @c42f, do you know of a plausible explanation for this?
Don't know right at the moment. The only thing I can say (without digging into it in detail) is that the compiler is very clever at optimizing away microbenchmarks, partially or completely, and I expect that's the cause of the weird `@btime` results.
My "solution" is always to write my own benchmark loop (I don't trust @btime
to sufficiently fool the compiler yet), carefully examine the generated code, and add cheap side effects until the optimizer can no longer remove the parts you want to time.
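A minimal sketch of that approach, assuming `StaticArrays` is loaded (the names here are my own): feed each result back into the next call, so the optimizer cannot hoist or delete the `muladd`, and time the loop directly.

```julia
using StaticArrays

# Hand-rolled benchmark loop: the result of each muladd feeds back into the
# next call, a cheap side effect that keeps the work inside the loop.
function manual_bench(n)
    dt = rand()                    # 0 < dt < 1, so the accumulator stays bounded
    up = @SMatrix rand(3, 3)
    acc = @SMatrix rand(3, 3)
    t0 = time_ns()
    for _ in 1:n
        acc = muladd(dt, acc, up)  # dt*acc + up; acc varies, so nothing is hoisted
    end
    elapsed = time_ns() - t0
    return acc, elapsed / n        # return acc so the result is observable
end

acc, ns_per_call = manual_bench(10^7)
println("≈ $ns_per_call ns per muladd call")
```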
MWE:
result:
`step!` should not allocate. `step!` should take, on average, roughly the time of however many `f` calls `ALG` makes, plus a bit more. Let's say that for Tsit `f` is called 16 times; then I would not expect `step!` to take much more than, let's say, 300 ns (adding around 50-100 ns more for the additions and multiplications with the coefficients).
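A rough, self-contained way to check that expectation (assuming `OrdinaryDiffEq`/`StaticArrays`, that Tsit means `Tsit5`, and a made-up test problem; the exact numbers will of course depend on `f`):

```julia
using OrdinaryDiffEq, StaticArrays

# Time step! from inside a function so global-scope dispatch overhead does
# not pollute the measurement; also check allocations of a single step.
function time_steps!(integ, n)
    step!(integ)                       # warm up; exclude compilation
    allocs = @allocated step!(integ)   # expectation above: 0 bytes
    t = @elapsed for _ in 1:n
        step!(integ)
    end
    return allocs, t / n * 1e9         # bytes per step, ns per step
end

g(u, p, t) = -u                        # made-up right-hand side
u0 = @SMatrix rand(3, 3)
prob = ODEProblem(g, u0, (0.0, Inf))   # open-ended tspan so stepping never runs out
integ = init(prob, Tsit5(), dt = 0.1, adaptive = false)
allocs, ns = time_steps!(integ, 10^4)
println("per step!: $allocs bytes, ≈ $(round(ns, digits = 1)) ns")
```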