trixi-framework / Trixi.jl

Trixi.jl: Adaptive high-order numerical simulations of conservation laws in Julia
https://trixi-framework.github.io/Trixi.jl
MIT License

Make `copy_to_coupled_boundary!` threaded #1981

Closed: efaulhaber closed this 2 weeks ago

efaulhaber commented 2 weeks ago

Timings for a simulation with 100k DOFs on my laptop with 6 threads (the difference will be much bigger with more threads).

Before (with #1978 and #1979):

────────────────────────────────────────────────────────────────────────────────────────────────────
Trixi.jl simulation finished.  Final time: 1.0368e6  Time steps: 2736 (accepted), 2736 (total)
────────────────────────────────────────────────────────────────────────────────────────────────────

 ───────────────────────────────────────────────────────────────────────────────────────
               Trixi.jl                        Time                    Allocations      
                                      ───────────────────────   ────────────────────────
           Tot / % measured:               35.9s /  63.7%           2.19GiB /  99.8%    

 Section                      ncalls     time    %tot     avg     alloc    %tot      avg
 ───────────────────────────────────────────────────────────────────────────────────────
 copy to coupled boundaries    13.7k    13.9s   60.6%  1.01ms   1.91GiB   87.7%   147KiB
 rhs!                          82.1k    7.87s   34.4%  95.8μs   89.3MiB    4.0%  1.11KiB
   volume integral             82.1k    3.67s   16.0%  44.7μs   22.6MiB    1.0%     288B
   interface flux              82.1k    1.51s    6.6%  18.4μs   26.5MiB    1.2%     339B
   surface integral            82.1k    826ms    3.6%  10.1μs   21.3MiB    1.0%     272B
   reset ∂u/∂t                 82.1k    728ms    3.2%  8.87μs   6.36KiB    0.0%    0.08B
   boundary flux               82.1k    643ms    2.8%  7.83μs     0.00B    0.0%    0.00B
   Jacobian                    82.1k    423ms    1.8%  5.15μs   18.9MiB    0.8%     241B
   ~rhs!~                      82.1k   65.8ms    0.3%   801ns   5.14KiB    0.0%    0.06B
   source terms                82.1k   1.04ms    0.0%  12.6ns     0.00B    0.0%    0.00B
 calculate dt                  2.74k    658ms    2.9%   241μs   34.5MiB    1.5%  12.9KiB
 I/O                              50    482ms    2.1%  9.65ms    151MiB    6.8%  3.03MiB
   save solution                 294    480ms    2.1%  1.63ms    150MiB    6.7%   522KiB
   ~I/O~                          50   1.92ms    0.0%  38.3μs    928KiB    0.0%  18.6KiB
   save mesh                      49    139μs    0.0%  2.84μs    602KiB    0.0%  12.3KiB
   get element variables         294   92.8μs    0.0%   316ns     0.00B    0.0%    0.00B
   get node variables            294   5.25μs    0.0%  17.8ns     0.00B    0.0%    0.00B
 ───────────────────────────────────────────────────────────────────────────────────────

This PR (with #1978 and #1979):

────────────────────────────────────────────────────────────────────────────────────────────────────
Trixi.jl simulation finished.  Final time: 1.0368e6  Time steps: 2736 (accepted), 2736 (total)
────────────────────────────────────────────────────────────────────────────────────────────────────

 ───────────────────────────────────────────────────────────────────────────────────────
               Trixi.jl                        Time                    Allocations      
                                      ───────────────────────   ────────────────────────
           Tot / % measured:               26.4s /  42.6%           2.31GiB /  99.9%    

 Section                      ncalls     time    %tot     avg     alloc    %tot      avg
 ───────────────────────────────────────────────────────────────────────────────────────
 rhs!                          82.1k    5.83s   51.8%  71.0μs   88.9MiB    3.8%  1.11KiB
   volume integral             82.1k    2.83s   25.2%  34.5μs   22.5MiB    1.0%     288B
   interface flux              82.1k    1.04s    9.2%  12.6μs   26.3MiB    1.1%     336B
   boundary flux               82.1k    595ms    5.3%  7.25μs     0.00B    0.0%    0.00B
   surface integral            82.1k    551ms    4.9%  6.71μs   21.3MiB    0.9%     272B
   reset ∂u/∂t                 82.1k    455ms    4.0%  5.54μs     0.00B    0.0%    0.00B
   Jacobian                    82.1k    293ms    2.6%  3.57μs   18.8MiB    0.8%     240B
   ~rhs!~                      82.1k   59.7ms    0.5%   727ns   5.14KiB    0.0%    0.06B
   source terms                82.1k   1.03ms    0.0%  12.5ns     0.00B    0.0%    0.00B
 copy to coupled boundaries    13.7k    4.32s   38.4%   316μs   2.04GiB   88.4%   156KiB
 calculate dt                  2.74k    713ms    6.3%   261μs   34.5MiB    1.5%  12.9KiB
 I/O                              50    395ms    3.5%  7.90ms    151MiB    6.4%  3.03MiB
   save solution                 294    393ms    3.5%  1.34ms    150MiB    6.3%   522KiB
   ~I/O~                          50   1.86ms    0.0%  37.2μs    928KiB    0.0%  18.6KiB
   save mesh                      49    137μs    0.0%  2.79μs    602KiB    0.0%  12.3KiB
   get element variables         294   96.0μs    0.0%   327ns     0.00B    0.0%    0.00B
   get node variables            294   8.92μs    0.0%  30.3ns     0.00B    0.0%    0.00B
 ───────────────────────────────────────────────────────────────────────────────────────
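
In the tables above, the "copy to coupled boundaries" section drops from 13.9s to 4.32s (roughly 3.2× on 6 threads), while its allocations stay at about 2 GiB. For readers unfamiliar with the pattern, here is a minimal sketch of threading a boundary copy loop. It is not the actual change in this PR: `copy_boundary_values!`, `u_other`, `u_boundary`, and `boundary_indices` are hypothetical names, and it uses `Threads.@threads` from Base rather than any Trixi.jl-specific macro.

```julia
using Base.Threads: @threads

# Hypothetical stand-in for a per-boundary copy; NOT Trixi.jl's implementation.
function copy_boundary_values!(u_boundary, u_other, boundary_indices)
    # Each iteration writes to a distinct column of `u_boundary`,
    # so the iterations are independent and can run in parallel.
    @threads for b in eachindex(boundary_indices)
        i = boundary_indices[b]
        for v in axes(u_other, 1)
            u_boundary[v, b] = u_other[v, i]
        end
    end
    return u_boundary
end

# Assumed toy data: 4 variables at 1000 nodes of the other system.
u_other = rand(4, 1_000)
# Assumed: every 10th node lies on the coupled boundary.
boundary_indices = collect(1:10:1_000)
u_boundary = zeros(4, length(boundary_indices))

copy_boundary_values!(u_boundary, u_other, boundary_indices)
```

Start Julia with `julia --threads=6` (or set `JULIA_NUM_THREADS`) for the loop to actually run in parallel. The sketch only works because no two iterations write to the same memory; a loop with shared accumulators would need a different approach.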

github-actions[bot] commented 2 weeks ago

Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope

Code quality

Documentation

Testing

Performance

Verification

Created with :heart: by the Trixi.jl community.

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 96.16%. Comparing base (5398b22) to head (58707b1).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1981      +/-   ##
==========================================
- Coverage   96.16%   96.16%   -0.00%
==========================================
  Files         460      460
  Lines       36958    36958
==========================================
- Hits        35539    35538       -1
- Misses       1419     1420       +1
```

| [Flag](https://app.codecov.io/gh/trixi-framework/Trixi.jl/pull/1981/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=trixi-framework) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/trixi-framework/Trixi.jl/pull/1981/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=trixi-framework) | `96.16% <100.00%> (-<0.01%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=trixi-framework#carryforward-flags-in-the-pull-request-comment) to find out more.
