COSIMA / access-om3

ACCESS-OM3 global ocean-sea ice-wave coupled model

Performance testing with generic WOMBAT #155

Open dougiesquire opened 1 month ago

dougiesquire commented 1 month ago

We should check the performance impact of running with WOMBAT and how it compares to ACCESS-OM2. Of course this will depend on the tracer timestep used in MOM and so is related to #138.

Also, it would be nice to have a single executable for the BGC and non-BGC configurations. Using generic tracers requires that the MOM6 _USE_GENERIC_TRACER compiler definition is set, so we should check that setting it doesn't impact performance when no generic tracers are configured.

dougiesquire commented 1 month ago

I've run a few tests to check whether setting _USE_GENERIC_TRACER impacts performance. I ran payu run -n 4 using the current MOM6-CICE6 1deg_jra55do_ryf configuration with executables built in three different ways:

  1. from current access-om3 main (@ 5373389)
  2. from #90 (@ 99f7cf2)
  3. from #90 (@ 99f7cf2) but without setting the _USE_GENERIC_TRACER and _USE_MOM6_DIAG compiler definitions

The experiments were run at the same time.

The results suggest that there is no performance cost from #90 when there are no generic tracers configured. Walltimes below are taken from the PAYU_WALLTIME field in the job.yaml output file.

| Includes #90 | _USE_GENERIC_TRACER | output000 | output001 | output002 | output003 | average |
|---|---|---|---|---|---|---|
| No | No | 1072.25 s | 1000.64 s | 1103.94 s | 1067.87 s | 1061.17 s |
| Yes | Yes | 1095.05 s | 1004.90 s | 1105.35 s | 1072.82 s | 1069.53 s |
| Yes | No | 1080.87 s | 995.24 s | 1095.75 s | 1041.77 s | 1053.41 s |

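
As a rough sanity check on these numbers, the sketch below recomputes the average walltime for each build, its difference relative to the build from main, and the spread between the four runs of each build. The walltimes are copied from the table above; the dictionary keys and variable names are labels made up for this sketch, not anything from payu or the model.

```python
from statistics import mean

# Walltimes in seconds, copied from the table above.
# The keys are hypothetical labels for the three builds.
walltimes = {
    "main": [1072.25, 1000.64, 1103.94, 1067.87],
    "pr90_with_defs": [1095.05, 1004.90, 1105.35, 1072.82],
    "pr90_without_defs": [1080.87, 995.24, 1095.75, 1041.77],
}

# Use the build from main as the reference.
baseline = mean(walltimes["main"])

summary = {}
for label, times in walltimes.items():
    avg = mean(times)
    summary[label] = {
        "mean_s": avg,
        # Difference in average walltime relative to main, in percent.
        "rel_diff_pct": 100.0 * (avg - baseline) / baseline,
        # Spread between the slowest and fastest of the four runs.
        "range_s": max(times) - min(times),
    }
    print(
        f"{label:>18}: mean {avg:8.2f} s, "
        f"{summary[label]['rel_diff_pct']:+.2f} % vs main, "
        f"range {summary[label]['range_s']:.2f} s"
    )
```

The averages of the two #90 builds differ from main by less than 1 % in either direction, while the four runs of any single build span about 100 s, so the differences between builds sit well inside the run-to-run variation.
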
dougiesquire commented 1 month ago

I forgot to also mention that the ocean.stats files are identical across the different runs above. This is expected but still worth checking.

dougiesquire commented 1 month ago

For reference, I ran the same test on the same configuration as above but with WOMBAT turned on. With all WOMBAT diagnostics turned off, these are the timings:

| Includes #90 | _USE_GENERIC_TRACER | output000 | output001 | output002 | output003 | average |
|---|---|---|---|---|---|---|
| Yes | Yes | 1802.25 s | 1638.26 s | 1808.77 s | 1749.80 s | 1749.77 s |

In this configuration DT_THERM = 3600.0. Bear in mind that WOMBAT has an internal timestep that defaults to 900 s.

So roughly 1.6-1.7x slower with WOMBAT than without (1749.77 s average walltime versus 1053.41-1069.53 s), which is pretty consistent with ACCESS-OM2.
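
For anyone wanting to reproduce the slowdown figure, here is a minimal sketch of the arithmetic. It uses the walltimes reported in this thread and assumes the matching non-WOMBAT build (includes #90, _USE_GENERIC_TRACER set) is the fair point of comparison. The variable names are made up for the sketch.

```python
from statistics import mean

# Walltimes in seconds for output000..output003, copied from the tables in this thread.
wombat_on = [1802.25, 1638.26, 1808.77, 1749.80]
# Assumed reference: same build (#90 with _USE_GENERIC_TRACER) but no generic tracers configured.
wombat_off = [1095.05, 1004.90, 1105.35, 1072.82]

# Ratio of the average walltimes.
slowdown = mean(wombat_on) / mean(wombat_off)

# Ratio for each output directory, pairing the runs by index.
per_run = [on / off for on, off in zip(wombat_on, wombat_off)]

print(f"slowdown from averages: {slowdown:.2f}x")
print("per-run slowdown:", ", ".join(f"{r:.2f}x" for r in per_run))
```

Against that reference the ratio of averages comes out at roughly 1.64, and the per-run ratios stay within a couple of percent of each other, so the slowdown looks stable across the four runs.
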