Open DanielDoehring opened 3 months ago
Introducing a @trixi_timeit timer()
around the callbacks
@trixi_timeit timer() "Step-Callbacks" begin
# handle callbacks
if callbacks isa CallbackSet
foreach(callbacks.discrete_callbacks) do cb
if cb.condition(integrator.u, integrator.t, integrator)
cb.affect!(integrator)
end
return nothing
end
end
end
one obtains for the example mentioned above:
────────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 6.04ms / 97.8% 347KiB / 97.3%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────────
main loop 1 5.02ms 85.0% 5.02ms 303KiB 89.7% 303KiB
Step-Callbacks 544 2.77ms 46.8% 5.09μs 294KiB 87.1% 554B
I/O 6 1.61ms 27.2% 268μs 54.3KiB 16.0% 9.04KiB
save solution 6 1.60ms 27.1% 266μs 43.8KiB 13.0% 7.30KiB
~I/O~ 6 7.98μs 0.1% 1.33μs 10.4KiB 3.1% 1.74KiB
get element variables 6 226ns 0.0% 37.7ns 0.00B 0.0% 0.00B
save mesh 6 167ns 0.0% 27.8ns 0.00B 0.0% 0.00B
get node variables 6 111ns 0.0% 18.5ns 0.00B 0.0% 0.00B
~Step-Callbacks~ 544 639μs 10.8% 1.18μs 146KiB 43.1% 275B
analyze solution 6 504μs 8.5% 84.0μs 94.2KiB 27.9% 15.7KiB
calculate dt 544 15.8μs 0.3% 29.0ns 0.00B 0.0% 0.00B
rhs! 2.72k 1.77ms 30.0% 651ns 6.61KiB 2.0% 2.49B
~rhs!~ 2.72k 982μs 16.6% 361ns 6.61KiB 2.0% 2.49B
volume integral 2.72k 174μs 2.9% 64.0ns 0.00B 0.0% 0.00B
source terms 2.72k 114μs 1.9% 41.9ns 0.00B 0.0% 0.00B
boundary flux 2.72k 108μs 1.8% 39.7ns 0.00B 0.0% 0.00B
interface flux 2.72k 82.2μs 1.4% 30.2ns 0.00B 0.0% 0.00B
prolong2interfaces 2.72k 71.7μs 1.2% 26.4ns 0.00B 0.0% 0.00B
prolong2boundaries 2.72k 67.2μs 1.1% 24.7ns 0.00B 0.0% 0.00B
surface integral 2.72k 62.4μs 1.1% 22.9ns 0.00B 0.0% 0.00B
Jacobian 2.72k 61.9μs 1.0% 22.7ns 0.00B 0.0% 0.00B
reset ∂u/∂t 2.72k 48.0μs 0.8% 17.6ns 0.00B 0.0% 0.00B
~main loop~ 1 324μs 5.5% 324μs 2.20KiB 0.7% 2.20KiB
Runge-Kutta step 2.72k 155μs 2.6% 57.1ns 0.00B 0.0% 0.00B
I/O 2 783μs 13.3% 392μs 19.2KiB 5.7% 9.60KiB
~I/O~ 2 466μs 7.9% 233μs 11.9KiB 3.5% 5.95KiB
save solution 1 317μs 5.4% 317μs 7.30KiB 2.2% 7.30KiB
get element variables 1 268ns 0.0% 268ns 0.00B 0.0% 0.00B
save mesh 1 28.0ns 0.0% 28.0ns 0.00B 0.0% 0.00B
get node variables 1 18.0ns 0.0% 18.0ns 0.00B 0.0% 0.00B
analyze solution 1 105μs 1.8% 105μs 15.8KiB 4.7% 15.8KiB
calculate dt 1 302ns 0.0% 302ns 0.00B 0.0% 0.00B
────────────────────────────────────────────────────────────────────────────────────────
Any ideas why we have this for the custom integrators, but not for the ones from OrdinaryDiffEq.jl
@ranocha ?
Can you check with @code_warntype
or Cthulhu.jl whether there are any type instabilities?
Hm, output for
@code_warntype summary_callback.condition(integrator.u, integrator.t, integrator)
@code_warntype summary_callback.affect!(integrator)
is all blue.
But from adding some additional @trixi_timeit timer()
in the callback sections I suppose that there is a type-instability in
cb.condition(integrator.u, integrator.t, integrator)
which we take from OrdinaryDiffEq.jl
, right?
Can you go one level higher with Cthulhu.jl and check where the type instability is introduced?
Hm, unfortunately this gives also no new information compared to @code_warntype
.
We observe allocations for the custom integrators, see
https://github.com/trixi-framework/Trixi.jl/blob/2dfde7faf3cc74f066d86148ae6c99ed9e58fa79/test/test_structured_2d.jl#L762-L774
https://github.com/trixi-framework/Trixi.jl/blob/2dfde7faf3cc74f066d86148ae6c99ed9e58fa79/test/test_tree_2d_advection.jl#L204-L215
and in the current PR #1871
https://github.com/trixi-framework/Trixi.jl/pull/1871/files#diff-4e3f4096536ca3da518b2f02b6f9197bc83ed9a7aec477b329951667dc1f830c
I observe that the allocations seem to be caused by the
SummaryCallback
.For instance, for the example tree_1d_dgsem/elixir_hypdiff_nonperiodic.jl
the alloc check
gives
7520
(as for the PERK2 test).If one supplies the callbacks
instead of
one obtains
0
.