KristofferC / TimerOutputs.jl

Formatted output of timed sections in Julia
Other
659 stars 54 forks source link

Allow custom merge points #128

Closed IanButterworth closed 3 years ago

IanButterworth commented 3 years ago

One way to handle multithreading seems to be to use local TimerOutput objects, but currently merge! just globally merges. This allows merging into a custom point in the destination to.

using TimerOutputs

to = TimerOutput()

@timeit to "1" begin
    @timeit to "1.1" sleep(0.1)
    @timeit to "1.2" sleep(0.1)
    @timeit to "1.3" sleep(0.1)
end

@timeit to "2" Threads.@spawn begin
    to2 = TimerOutput()
    @timeit to2 "2.1" sleep(0.1)
    @timeit to2 "2.2" sleep(0.1)
    @timeit to2 "2.3" sleep(0.1)
    merge!(to, to2, tree_point = ["2"])
end
julia> to
 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations      
                   ──────────────────────   ───────────────────────
 Tot / % measured:      3.23s / 9.79%           13.5MiB / 36.9%    
 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 1              1    309ms  98.0%   309ms   4.55MiB  91.5%  4.55MiB
   1.3          1    106ms  33.6%   106ms      320B  0.01%     320B
   1.2          1    102ms  32.3%   102ms      320B  0.01%     320B
   1.1          1    101ms  32.0%   101ms   4.54MiB  91.4%  4.54MiB
 2              1   6.47ms  2.05%  6.47ms    435KiB  8.54%   435KiB
   2.2          1    106ms  33.6%   106ms      480B  0.01%     480B
   2.3          1    105ms  33.4%   105ms      144B  0.00%     144B
   2.1          1    103ms  32.5%   103ms   5.03MiB  101%   5.03MiB
 ──────────────────────────────────────────────────────────────────

otherwise, currently a merge!(to, to2) would result in

julia> to
 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations      
                   ──────────────────────   ───────────────────────
 Tot / % measured:      1.86s / 33.8%           1.61MiB / 0.65%    

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 1              1    315ms  50.2%   315ms   4.02KiB  37.2%  4.02KiB
   1.2          1    106ms  16.9%   106ms      320B  2.90%     320B
   1.1          1    105ms  16.8%   105ms      320B  2.90%     320B
   1.3          1    103ms  16.5%   103ms      320B  2.90%     320B
 2.1            1    105ms  16.8%   105ms      144B  1.30%     144B
 2.2            1    105ms  16.8%   105ms      144B  1.30%     144B
 2.3            1    102ms  16.2%   102ms      144B  1.30%     144B
 2              1    134μs  0.02%   134μs   6.35KiB  58.9%  6.35KiB
 ──────────────────────────────────────────────────────────────────

I can add tests if this is welcomed

Octogonapus commented 3 years ago

@IanButterworth This is great, thanks for suggesting this!

IanButterworth commented 3 years ago

Not sure why they're not showing up here still.. but the actions tests are passing on my branch https://github.com/IanButterworth/TimerOutputs.jl/runs/2906281500

This is the table for the additional test

 ────────────────────────────────────────────────────────────────────
                             Time                   Allocations      
                     ──────────────────────   ───────────────────────
  Tot / % measured:       845ms / 38.6%           9.47MiB / 4.64%    

 Section     ncalls     time   %tot     avg     alloc   %tot      avg
 ────────────────────────────────────────────────────────────────────
 1                1    315ms  96.6%   315ms   3.03KiB  0.67%  3.03KiB
   1.1            1    105ms  32.3%   105ms      176B  0.04%     176B
   1.3            1    105ms  32.2%   105ms      176B  0.04%     176B
   1.2            1    105ms  32.0%   105ms      176B  0.04%     176B
 2                1   11.0ms  3.37%  11.0ms    441KiB  97.8%   441KiB
   2.1            1    105ms  32.3%   105ms   4.85MiB  1102%  4.85MiB
   2.3            1    103ms  31.7%   103ms   2.17KiB  0.48%  2.17KiB
   2.2            1    102ms  31.2%   102ms   3.00KiB  0.67%  3.00KiB
 3                1   41.2μs  0.01%  41.2μs   6.69KiB  1.49%  6.69KiB
   3.1            1   32.8μs  0.01%  32.8μs      464B  0.10%     464B
     3.1.1        1    105ms  32.2%   105ms   1.17MiB  266%   1.17MiB
     3.1.2        1    103ms  31.6%   103ms   2.17KiB  0.48%  2.17KiB
     3.1.3        1    101ms  30.9%   101ms   5.77KiB  1.28%  5.77KiB
   3.2            1   6.38μs  0.00%  6.38μs      416B  0.09%     416B
     3.2.2        1    104ms  31.9%   104ms   5.44KiB  1.21%  5.44KiB
     3.2.1        1    104ms  31.9%   104ms   2.17KiB  0.48%  2.17KiB
     3.2.3        1    104ms  31.8%   104ms   4.77KiB  1.06%  4.77KiB
 ────────────────────────────────────────────────────────────────────
IanButterworth commented 3 years ago

I'm trying to add the accumulated data within a thread into the parents but it's not quite working.

Perhaps this is ok, as "2" did only take 11ms (mostly compilation I assume) to return given it spawns, even though the child processes are ~300ms.

So maybe it doesn't need fixing?

This is after the latest commits

julia> to
 ────────────────────────────────────────────────────────────────────
                             Time                   Allocations      
                     ──────────────────────   ───────────────────────
  Tot / % measured:       7.89s / 4.30%           78.5MiB / 1.45%    

 Section     ncalls     time   %tot     avg     alloc   %tot      avg
 ────────────────────────────────────────────────────────────────────
 1                1    329ms  97.1%   329ms    723KiB  61.8%   723KiB
   1.1            1    112ms  33.0%   112ms   8.58KiB  0.73%  8.58KiB
   1.3            1    105ms  31.0%   105ms      320B  0.03%     320B
   1.2            1    105ms  31.0%   105ms      320B  0.03%     320B
 2                1   9.93ms  2.92%  9.93ms    439KiB  37.6%   439KiB
   2.2            1    103ms  30.5%   103ms   1.12KiB  0.10%  1.12KiB
   2.3            1    102ms  30.1%   102ms   1.25KiB  0.11%  1.25KiB
   2.1            1    101ms  29.8%   101ms   5.71MiB  500%   5.71MiB
 3                1   29.8μs  0.01%  29.8μs   6.75KiB  0.58%  6.75KiB
   3.1            1   17.1μs  0.01%  17.1μs      416B  0.03%     416B
     3.1.2        1    102ms  29.9%   102ms   2.08KiB  0.18%  2.08KiB
     3.1.3        1    101ms  29.8%   101ms   12.3MiB  1076%  12.3MiB
     3.1.1        1    101ms  29.7%   101ms   1.50MiB  132%   1.50MiB
   3.2            1   2.55μs  0.00%  2.55μs      368B  0.03%     368B
     3.2.3        1    105ms  31.0%   105ms   15.6MiB  1365%  15.6MiB
     3.2.1        1    105ms  30.9%   105ms   2.09KiB  0.18%  2.09KiB
     3.2.2        1    102ms  30.1%   102ms   1.25KiB  0.11%  1.25KiB
 ────────────────────────────────────────────────────────────────────