KristofferC / TimerOutputs.jl

Formatted output of timed sections in Julia
Other
642 stars 53 forks source link

Additional allocations in Julia v1.6 compared to Julia v1.5 #105

Open ranocha opened 3 years ago

ranocha commented 3 years ago

Thanks for this nice package! We use it in our hyperbolic PDE solver Trixi.jl and are really happy to be able to do so. When testing the release candidate of Julia v1.6, I noticed that we get additional allocations compared to Julia v1.5.3, which seem to be caused by TimerOutputs.jl.

On Julia v1.5.3, I get

julia15 --threads=1 -e 'using Pkg; Pkg.activate(temp=true); Pkg.add("Trixi"); Pkg.add("OrdinaryDiffEq"); using Trixi; trixi_include(joinpath(examples_dir(), "2d", "elixir_hypdiff_nonperiodic.jl"))'
[...]
 -------------------------------------------------------------------------------
            Trixi.jl                    Time                   Allocations      
                                ----------------------   -----------------------
        Tot / % measured:            4.08s / 71.7%            181MiB / 30.3%    

 Section                ncalls     time   %tot     avg     alloc   %tot      avg
 -------------------------------------------------------------------------------
 rhs!                    12.6k    1.23s  42.2%  98.2μs   2.70MiB  4.91%     225B
   source terms          12.6k    436ms  14.9%  34.7μs   1.34MiB  2.45%     112B
   volume integral       12.6k    427ms  14.6%  34.0μs     0.00B  0.00%    0.00B
   surface integral      12.6k    104ms  3.55%  8.26μs     0.00B  0.00%    0.00B
   prolong2interfaces    12.6k   85.1ms  2.91%  6.77μs     0.00B  0.00%    0.00B
   interface flux        12.6k   71.2ms  2.43%  5.66μs     0.00B  0.00%    0.00B
   Jacobian              12.6k   45.7ms  1.56%  3.63μs     0.00B  0.00%    0.00B
   boundary flux         12.6k   32.5ms  1.11%  2.59μs     0.00B  0.00%    0.00B
   reset ∂u/∂t           12.6k   9.22ms  0.32%   734ns     0.00B  0.00%    0.00B
   prolong2boundaries    12.6k   5.31ms  0.18%   422ns     0.00B  0.00%    0.00B
   prolong2mortars       12.6k    856μs  0.03%  68.1ns     0.00B  0.00%    0.00B
   mortar flux           12.6k    649μs  0.02%  51.6ns     0.00B  0.00%    0.00B
 analyze solution           27    1.07s  36.5%  39.5ms   20.9MiB  38.2%   795KiB
 I/O                        28    626ms  21.4%  22.3ms   31.3MiB  56.9%  1.12MiB
 calculate dt            2.52k    520μs  0.02%   207ns     0.00B  0.00%    0.00B
 -------------------------------------------------------------------------------

On Julia v1.6.0-rc1, I get

julia16 --threads=1 -e 'using Pkg; Pkg.activate(temp=true); Pkg.add("Trixi"); Pkg.add("OrdinaryDiffEq"); using Trixi; trixi_include(joinpath(examples_dir(), "2d", "elixir_hypdiff_nonperiodic.jl"))'
[...]
 -------------------------------------------------------------------------------
            Trixi.jl                    Time                   Allocations      
                                ----------------------   -----------------------
        Tot / % measured:            9.02s / 73.4%            562MiB / 32.6%    

 Section                ncalls     time   %tot     avg     alloc   %tot      avg
 -------------------------------------------------------------------------------
 analyze solution           27    4.48s  67.6%   166ms   75.6MiB  41.2%  2.80MiB
 rhs!                    12.6k    1.36s  20.6%   108μs   86.7MiB  47.3%  7.06KiB
   source terms          12.6k    460ms  6.95%  36.6μs   25.7MiB  14.0%  2.09KiB
   volume integral       12.6k    443ms  6.68%  35.2μs     0.00B  0.00%    0.00B
   surface integral      12.6k    108ms  1.63%  8.58μs     0.00B  0.00%    0.00B
   prolong2interfaces    12.6k   96.7ms  1.46%  7.70μs     0.00B  0.00%    0.00B
   interface flux        12.6k   76.5ms  1.16%  6.09μs     0.00B  0.00%    0.00B
   Jacobian              12.6k   51.9ms  0.78%  4.13μs     0.00B  0.00%    0.00B
   boundary flux         12.6k   43.3ms  0.65%  3.44μs     0.00B  0.00%    0.00B
   prolong2mortars       12.6k   20.8ms  0.31%  1.65μs   35.3MiB  19.2%  2.88KiB
   reset ∂u/∂t           12.6k   13.5ms  0.20%  1.08μs     0.00B  0.00%    0.00B
   prolong2boundaries    12.6k   12.2ms  0.18%   970ns     0.00B  0.00%    0.00B
   mortar flux           12.6k   7.66ms  0.12%   609ns     0.00B  0.00%    0.00B
 I/O                        28    784ms  11.8%  28.0ms   21.0MiB  11.5%   770KiB
 calculate dt            2.52k    704μs  0.01%   280ns     0.00B  0.00%    0.00B
 -------------------------------------------------------------------------------

If I use Revise.jl and modify the parts where @timeit_debug is used, e.g. by replacing them by @timeit, much (if not all) of the allocation overhead in Julia v1.6 is gone. However, if I replace @timeit_debug by @timeit and start Julia v1.6 from scratch, I get again the same allocations (when having Trixi.jl in development mode instead of adding a registered version as in the MWE above).

KristofferC commented 3 years ago

The implementation of @timeit_debug is not great and has some performance problems. However, I did not know that @timeit has any such problems. Do you have a smaller example where @timeit gives extra allocations on 1.6 vs 1.5?

ranocha commented 3 years ago

I'm sorry, I've been trying to create a smaller example but haven't been successful so far.

KristofferC commented 3 years ago

This example doesn' run, I get:

ERROR: LoadError: ArgumentError: Package OrdinaryDiffEq not found in current path:
- Run `import Pkg; Pkg.add("OrdinaryDiffEq")` to install the OrdinaryDiffEq package.
ranocha commented 3 years ago

Sorry, I added Pkg.add("OrdinaryDiffEq") above - it's in my default environment so I didn't notice the error. It should run now as it is.

ranocha commented 3 years ago

A reason for allocations was the use of a try ... catch ... block in TimerOutputs. I implemented our own reduced version of @timeit_debug without that in https://github.com/trixi-framework/Trixi.jl/pull/509 and could fix some type instabilities and allocations coming from @timeit_debug. Maybe it could be an option to @timeit and @timeit_debug to avoid this try block as well?

KristofferC commented 3 years ago

There could be a @timeit_nothrow.