trixi-framework / Trixi.jl

Trixi.jl: Adaptive high-order numerical simulations of conservation laws in Julia
https://trixi-framework.github.io/Trixi.jl
MIT License
508 stars 98 forks source link

Switch to error-based step size control as default in elixirs #606

Open ranocha opened 3 years ago

ranocha commented 3 years ago

The Runge-Kutta methods with automatic step size control optimized for compressible fluid dynamics Lisandro Dalcin, Matteo Parsani, David Ketcheson, and I published a few weeks ago in a preprint on arXiv are now available in the Julia differential equations ecosystem.

julia> using OrdinaryDiffEq, Trixi
julia> trixi_include("examples/2d/elixir_euler_vortex.jl") # isentropic vortex standard setup using CFL-based step size control for the 2N method of Carpenter and Kennedy
[...]
 Variable:       rho              rho_v1           rho_v2           rho_e         
 L2 error:       3.63431418e-06   3.21113799e-03   3.21114828e-03   4.54571590e-03
[...]
 ──────────────────────────────────────────────────────────────────────────────────
              Trixi.jl                     Time                   Allocations      
                                   ──────────────────────   ───────────────────────
         Tot / % measured:              575ms / 94.0%           4.91MiB / 95.5%    
 Section                   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────────
 rhs!                       2.78k    455ms  84.2%   164μs    966KiB  20.1%     356B
   volume integral          2.78k    167ms  30.9%  60.0μs     0.00B  0.00%    0.00B
   interface flux           2.78k    135ms  24.9%  48.4μs     0.00B  0.00%    0.00B
   surface integral         2.78k   66.5ms  12.3%  23.9μs     0.00B  0.00%    0.00B
   prolong2interfaces       2.78k   65.3ms  12.1%  23.5μs     0.00B  0.00%    0.00B
   Jacobian                 2.78k   10.1ms  1.86%  3.62μs     0.00B  0.00%    0.00B
   reset ∂u/∂t              2.78k   6.82ms  1.26%  2.45μs     0.00B  0.00%    0.00B
   ~rhs!~                   2.78k   4.28ms  0.79%  1.54μs    966KiB  20.1%     356B
   prolong2boundaries       2.78k    143μs  0.03%  51.5ns     0.00B  0.00%    0.00B
   prolong2mortars          2.78k    107μs  0.02%  38.3ns     0.00B  0.00%    0.00B
   mortar flux              2.78k   72.4μs  0.01%  26.0ns     0.00B  0.00%    0.00B
   boundary flux            2.78k   53.3μs  0.01%  19.2ns     0.00B  0.00%    0.00B
   source terms             2.78k   46.1μs  0.01%  16.6ns     0.00B  0.00%    0.00B
 I/O                           15   59.9ms  11.1%  3.99ms   3.57MiB  76.1%   244KiB
   save solution                7   55.1ms  10.2%  7.87ms   2.74MiB  58.3%   400KiB
   ~I/O~                       15   4.75ms  0.88%   317μs    835KiB  17.4%  55.7KiB
   get element variables        7   49.6μs  0.01%  7.09μs   17.9KiB  0.37%  2.56KiB
   save mesh                    7    936ns  0.00%   134ns     0.00B  0.00%    0.00B
 calculate dt                 557   16.4ms  3.04%  29.5μs     0.00B  0.00%    0.00B
 analyze solution               7   9.31ms  1.72%  1.33ms    181KiB  3.77%  25.9KiB
 ──────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, RDPK3SpFSAL49(), abstol=1.0e-6, reltol=1.0e-6,
                   save_everystep=false, callback=callbacks); summary_callback() # one of our methods
[...]
 Variable:       rho              rho_v1           rho_v2           rho_e         
 L2 error:       3.64722648e-06   3.21113022e-03   3.21114614e-03   4.54571112e-03
[...]
 ──────────────────────────────────────────────────────────────────────────────────
              Trixi.jl                     Time                   Allocations      
                                   ──────────────────────   ───────────────────────
         Tot / % measured:              364ms / 89.4%           3.23MiB / 83.1%    
 Section                   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────────
 rhs!                       1.90k    314ms  96.5%   165μs    664KiB  24.1%     358B
   volume integral          1.90k    116ms  35.6%  60.9μs     0.00B  0.00%    0.00B
   interface flux           1.90k   93.1ms  28.6%  48.9μs     0.00B  0.00%    0.00B
   surface integral         1.90k   45.7ms  14.0%  24.0μs     0.00B  0.00%    0.00B
   prolong2interfaces       1.90k   45.1ms  13.8%  23.7μs     0.00B  0.00%    0.00B
   Jacobian                 1.90k   7.10ms  2.18%  3.73μs     0.00B  0.00%    0.00B
   reset ∂u/∂t              1.90k   4.08ms  1.25%  2.14μs     0.00B  0.00%    0.00B
   ~rhs!~                   1.90k   2.83ms  0.87%  1.49μs    664KiB  24.1%     358B
   prolong2boundaries       1.90k   86.8μs  0.03%  45.7ns     0.00B  0.00%    0.00B
   prolong2mortars          1.90k   67.8μs  0.02%  35.6ns     0.00B  0.00%    0.00B
   mortar flux              1.90k   53.2μs  0.02%  28.0ns     0.00B  0.00%    0.00B
   boundary flux            1.90k   39.5μs  0.01%  20.7ns     0.00B  0.00%    0.00B
   source terms             1.90k   31.2μs  0.01%  16.4ns     0.00B  0.00%    0.00B
 analyze solution               4   5.72ms  1.76%  1.43ms    104KiB  3.76%  25.9KiB
 I/O                            9   5.70ms  1.75%   634μs   1.94MiB  72.1%   220KiB
   save solution                4   3.33ms  1.02%   833μs   1.53MiB  56.9%   392KiB
   ~I/O~                        9   2.35ms  0.72%   261μs    408KiB  14.8%  45.3KiB
   get element variables        4   27.1μs  0.01%  6.79μs   10.2KiB  0.37%  2.56KiB
   save mesh                    4    392ns  0.00%  98.0ns     0.00B  0.00%    0.00B
 ──────────────────────────────────────────────────────────────────────────────────

As discussed on Slack, it would be nice to use these improvements by switching our default time integration methods.

However, we should discuss whether we consider this to be a breaking change. I don't think so since it's not documented anywhere. On the other hand, people might argue that their code could behaves differently when they just trixi_include some elixir distributed with Trixi.

andrewwinters5000 commented 3 years ago

I noticed using SSPRK43 sometimes the tests would pass locally but fail on github because the error differences would be on the order of 1e-8. This was many version ago of the unstructured solver, but it might be something to keep in mind.

ranocha commented 3 years ago

Well, it's reasonable to expect differences like these I guess. Thus, we will probably need to adapt tests, too, if we make this switch

sloede commented 3 years ago

Intuitively, I would consider it a breaking change, at least for default_example(). Also, I think it would be good to get some comparisons of the different schemes available with our current examples to get a better idea of what the "best" scheme with "optimal" parameters is.

In case we figure out a good parameter set for the tolerances, is there any chance we can get these as new defaults into OrdinaryDiffeq.jl :grimacing: Otherwise we should also consider what we consider acceptable errors in comparison to a standard, high-order time integration scheme, and if the default parameters fit our our bill. I'd hate to see replacing one parameter (CFL) by another (tolerances).

Overall, I am all for this change, but at the moment feel like it would require some more preparatory work (thus also the suggestion for a joint thesis project).

ranocha commented 3 years ago

Sure, this will need some work. I just wanted to start this discussion on GitHub such that it doesn't get lost in Slack.

is there any chance we can get these as new defaults into OrdinaryDiffeq.jl

I don't think so - at least I would be quite surprised.

"best" scheme with "optimal" parameters

Do we really want to have that? Doesn't it suffice to have a reasonably good scheme? I'm sure many of the scheme/CFL parameters we have right now are far from "optimal".

I'd hate to see replacing one parameter (CFL) by another (tolerances)

At least the sensitivity will change dramatically (from linear to roughly logarithmic unless we're in the stability-limited regime and the tolerances are not too coarse).

sloede commented 3 years ago

Sure, this will need some work. I just wanted to start this discussion on GitHub such that it doesn't get lost in Slack.

:+1:

"best" scheme with "optimal" parameters

Do we really want to have that? Doesn't it suffice to have a reasonably good scheme? I'm sure many of the scheme/CFL parameters we have right now are far from "optimal".

Yes, but that "reasonably" is key for me. I have very little experience with these schemes (unlike you), so I'd feel more comfortable to see a few more numbers for Trixi and the typical problems we have been looking at so far. Especially, when we have a mixture of error-limited and stability-limited setups.

I'd hate to see replacing one parameter (CFL) by another (tolerances)

At least the sensitivity will change dramatically (from linear to roughly logarithmic unless we're in the stability-limited regime and the tolerances are not too coarse).

I agree, this is one of the great benefits I am looking forward to :+1:

andrewwinters5000 commented 3 years ago

I used the RDPK3SpFSAL49 time integrator mentioned above in the new APE unstructured test in #615 . There were no issues with tolerance checks or anything. (It is also very fast!)

gregorgassner commented 3 years ago

I agree that these new methods work like a charm. I think that these methods are very well suited for applications and production runs. And I would support the idea that our elixirs with "applications" will have this method as a standard.

However, I am not sure if we should change all elixirs to this type of time integration. Here is my reasoning: (i) CFL based time stepping is the current standard and what people are used to/know about (ii) When implementing/trying out/debugging/testing I personally like to have full control over the time stepping in the sense that with the choice of the CFL number I can directly play around with time step sizes etc. This often comes in very handy so I would argue that our "verification" elixirs keep the CFL based time stepping (eoc, ec, ...)

jlchan commented 3 years ago

When implementing/trying out/debugging/testing I personally like to have full control over the time stepping in the sense that with the choice of the CFL number I can directly play around with time step sizes etc. This often comes in very handy so I would argue that our "verification" elixirs keep the CFL based time stepping (eoc, ec, ...)

Same here, but could we also do this by defaulting to adaptive=false and specifying dt?

ranocha commented 3 years ago

Yes, you can set solve(ode, ..., adaptive=false) and set dt or add the step size callback. The new RK methods will still work with these arguments.

gregorgassner commented 3 years ago

That is a good idea. I think it would be good to have some elixirs (maybe always the eoc ones that are available anyway for each PDE) with explicit time step callback and CFL. Then I do not mind having all others with the adatpive time stepping. If we have a "rule" that all eoc files use the dt calculation, it would also be easy to find for the PDE when needed...