Make CI tests a bit less expensive

ranocha commented 3 years ago

For example, here is a list of examples that are relatively expensive on Windows (2D)

examples\2d\elixir_advection_amr.jl, 93.2s
examples\2d\elixir_advection_amr_nonperiodic.jl, 47.3s
examples\2d\elixir_hypdiff_nonperiodic.jl, 38.9s
examples\2d\elixir_euler_shockcapturing.jl, 46.5s
examples\2d\elixir_euler_blast_wave_amr.jl, 62.8s
examples\2d\elixir_euler_sedov_blast_wave.jl, 61.1s
examples\2d\elixir_euler_positivity.jl, 57.6s
examples\2d\elixir_mhd_alfven_wave.jl, 46.6s
examples\2d\elixir_mhd_alfven_wave_mortar.jl, 74.7s
examples\2d\elixir_mhd_orszag_tang.jl, 45.8s
examples\2d\elixir_lbm_lid_driven_cavity.jl, 40.6s
examples\2d\elixir_mhd_rotor.jl, 192s
examples\2d\elixir_mhd_blast_wave.jl, 158s

and Ubuntu (3D)

examples/3d/elixir_advection_mortar.jl, 40.7s
examples/3d/elixir_hypdiff_nonperiodic.jl, 51.9s
examples/3d/elixir_euler_amr.jl, 111s
examples/3d/elixir_euler_shockcapturing.jl, 67.1s
examples/3d/elixir_euler_sedov_blast_wave.jl, 72.0s (although it's using only 5 time steps)
examples/3d/elixir_eulergravity_eoc_test.jl, 44.2s (although it's using only 9 time steps)
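
To rank entries like these or total their cost, the list can be parsed with a few lines of Julia. This is only a minimal sketch: the helper name `parse_timing` is hypothetical, and it assumes every line has the shape `<path>, <seconds>s`, optionally followed by a note.

```julia
# Hypothetical helper (not part of Trixi): parse "<path>, <seconds>s [note]".
function parse_timing(line::AbstractString)
    m = match(r"^(.+?),\s*([0-9.]+)s", line)
    m === nothing && return nothing
    return (path = String(m[1]), seconds = parse(Float64, m[2]))
end

# Three entries copied from the Ubuntu (3D) list above.
lines = [
    "examples/3d/elixir_advection_mortar.jl, 40.7s",
    "examples/3d/elixir_euler_amr.jl, 111s",
    "examples/3d/elixir_euler_sedov_blast_wave.jl, 72.0s (although it's using only 5 time steps)",
]

# Drop unparsable lines, sort from most to least expensive, and sum the cost.
timings = filter(!isnothing, map(parse_timing, lines))
sorted = sort(timings; by = t -> t.seconds, rev = true)
total = sum(t.seconds for t in sorted)
```

Here `sorted[1]` is the most expensive of the parsed entries and `total` is their combined run time in seconds.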

sloede commented 3 years ago

Are these times with (pre-)compilation? Because if I run examples\2d\elixir_advection_amr.jl on my laptop, it finishes in <5 s.

ranocha commented 3 years ago

These are the times reported by the summary_callback after each test.

efaulhaber commented 3 years ago

Related to #62

ranocha commented 3 years ago

We might also want to discuss the following questions/options.

Do we need to run all 2D tests on Windows and Mac OS? Would it suffice to run only the MPI and threaded tests?
Can we remove the restart callback save_restart from many elixirs?
Split some expensive test sets into more CI jobs

sloede commented 3 years ago

Do we need to run all 2D tests on Windows and Mac OS? Would it suffice to run only the MPI and threaded tests?

I think we said yes. Sometimes, in the past there have been weird macOS-related issues, and I think we should make sure that we exercise most of the core functionality of Trixi on all relevant platforms. At least as long as it does not become unbearable... If we want to save time during development, what about disabling the macOS and Windows tests on Draft PRs? This way one could have faster turnaround times during most of a PR's lifetime, and only get the full checks once we are ready to merge.

Can we remove the restart callback save_restart from many elixirs?

Yes, I have no issue with this. IMHO, we can at least remove that from all but one elixir per dimension-mesh-solver-equation combination.

Split some expensive test sets into more CI jobs

Absolutely. In the past, I have suggested this before, but you (rightfully) warned that due to startup latency, this does not always make it faster.

ranocha commented 3 years ago

Split some expensive test sets into more CI jobs

Absolutely. In the past, I have suggested this before, but you (rightfully) warned that due to startup latency, this does not always make it faster.

Yeah, but we now have a bunch of new equation and mesh types, so we benefit less from re-using compiled code.

ranocha commented 3 years ago

Another (minor) aspect: Documenter is set up to fail when doctests fail, so we don't need to run doctests in https://github.com/trixi-framework/Trixi.jl/blob/ff549a5e67a7685f2ad3c97a0694c756160d79b4/test/test_unit.jl#L515-L517

jlchan commented 3 years ago

The tests really take way too long to run IMO. It's so far been 15 minutes and my tests are still running.

It would be nice to have a minimal set of tests which preserve "enough" code coverage so that any changes to Trixi base could be more quickly checked on a local machine.

jlchan commented 3 years ago

One possibility would be to create a testset intended for "local" testing, which could exclude some of the CI tests.

ranocha commented 3 years ago

That's definitely a good point. What I usually do when modifying Trixi is to include only a subset of tests locally, say test/test_examples_2d_advection.jl when I modified some 2D stuff. That's usually a good smoke test. However, it's a bit hard to cover (nearly) everything in a cheap test set using the current way of testing, I fear.
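
The "local" test set idea could be sketched as follows. This is a hedged illustration, not Trixi's actual test setup: the environment variable `LOCAL_TEST_SET`, the group names, and the helper `select_tests` are all assumptions. Only the two file names come from this thread.

```julia
# Hypothetical grouping of test files. The group names are made up for
# illustration; the file names are the ones mentioned in this thread.
const TEST_GROUPS = Dict(
    "smoke" => ["test_examples_2d_advection.jl"],
    "all"   => ["test_examples_2d_advection.jl", "test_unit.jl"],
)

# Pick the test files for the group named in LOCAL_TEST_SET (default: "all").
# `env` is a parameter so the selection can be checked without touching ENV.
function select_tests(env = ENV)
    group = get(env, "LOCAL_TEST_SET", "all")
    haskey(TEST_GROUPS, group) || error("unknown test group: $group")
    return TEST_GROUPS[group]
end

# In a runtests.jl one would then loop over the selection, e.g.
#   for file in select_tests()
#       include(file)
#   end
```

With something like this, setting `LOCAL_TEST_SET=smoke` before running the tests would restrict a local run to the cheap subset, while CI keeps the default.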

efaulhaber commented 3 years ago

Has anything significantly changed the timings since you reported them, @ranocha? I noticed that 3d/elixir_euler_amr.jl now takes over 300s on GitHub (and for some reason over 400s on my system, maybe that's because of Windows?). 2D and 3D tests regularly take over an hour now. Is it really necessary to let the simulation run that long? Would it be sufficient to let tests like this run to t=1 instead of t=10 (maybe using a different start time so that we still test that the blob runs over the periodic boundaries)?
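
Whether a shortened run still exercises the periodic boundaries can be estimated with a little arithmetic. The sketch below is illustrative only. The domain [-1, 1), the speed 1, and the initial blob position 0 are assumptions, not values taken from the elixir.

```julia
# Assumed setup (not from the elixir): periodic domain [-1, 1), constant
# advection speed 1, blob centered at x = 0 at time t = 0.
const XMIN, XMAX, SPEED, X0 = -1.0, 1.0, 1.0, 0.0

# Blob center at time t, wrapped back into the periodic domain.
blob_center(t) = XMIN + mod(X0 + SPEED * t - XMIN, XMAX - XMIN)

# Number of times the blob center wraps around between t_start and t_end
# (valid for SPEED > 0).
function boundary_crossings(t_start, t_end)
    laps(t) = floor(Int, (X0 + SPEED * t - XMIN) / (XMAX - XMIN))
    return laps(t_end) - laps(t_start)
end
```

Under these assumptions, a run over `(0.0, 0.9)` never reaches the boundary, whereas a run of the same order of length starting later, e.g. `(0.5, 1.5)`, still crosses it once.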

ranocha commented 3 years ago

Yes, something like that is definitely a good option from my point of view. A major impact on the CI run time was our more extensive use of Polyester, StrideArrays, and LoopVectorization. This combination is really good for runtime performance, but particularly demanding for CI when collecting coverage results.

sloede commented 3 years ago

Finding good tests is always tricky. "As short as possible but as long as necessary" is our yardstick, but what exactly the latter part means in practice is often hard to tell.

I fully agree that we need to reduce the amount of time it takes for testing, but from past experience (especially from tests that didn't run long enough to uncover errors that were only found much later), I feel like this is in general a non-trivial task and requires some thinking and selective tweaking. This is also the only reason this hasn't been tackled yet - a lack of developer time :-/

ranocha commented 3 years ago

We are experiencing some problems with GitHub actions in the last few days - jobs are stuck at the queued stage although we have enough free capacity. One way to reduce problems like these could be to reduce the number of tests that need to run on all three OS. From my point of view, it should be sufficient to have some basic tests on all OS (including threads, MPI, p4est, and other binary dependencies), but we definitely do not need to test every 2D setup on Windows and Mac OS.

sloede commented 2 years ago

We are experiencing some problems with GitHub actions in the last few days - jobs are stuck at the queued stage although we have enough free capacity. One way to reduce problems like these could be to reduce the number of tests that need to run on all three OS. From my point of view, it should be sufficient to have some basic tests on all OS (including threads, MPI, p4est, and other binary dependencies), but we definitely do not need to test every 2D setup on Windows and Mac OS.

IIRC, this has been resolved by your efforts this year, hasn't it, @ranocha?

ranocha commented 2 years ago

This particular problem, yes. However, I think CI is still too expensive.

trixi-framework / Trixi.jl

Make CI tests a bit less expensive #372