NOC-MSM / SE-NEMO

Shelf Enabled Global NEMO
GNU General Public License v3.0

4.0.4 run.stat from master does not match 4.0.4 run.stat from 4.2 branch #133

Open jdha opened 11 months ago

jdha commented 11 months ago

4.0.4 run.stat from master does not match 4.0.4 run.stat from 4.2 branch

jdha commented 11 months ago

Update: the run.stat reproducibility issue persists on master with gnu-mpich at -O2, -O1 and -O0, and also with -finit-local-zero. Tried with both the 68- and 19-node options.

Next step: re-run the tests with the Cray compiler.

jdha commented 11 months ago

Same happens with the Cray compiler. A quick test with the GYRE test config looks to be fine.

jdha commented 11 months ago

I had another look at the reproducibility issue yesterday and figured out it was in the ice code. I had thought I’d initialised all variables to zero – but as I’d only done this test with GNU I thought I’d check with the Cray compiler. I recompiled and it was reproducible. So I guess I don’t know my compiler options very well or at least don’t know how to look them up!

For gfortran I used: -finit-local-zero … but maybe I’ve missed something. For the Cray compiler I used -e0, and this seemed to solve my issue. I’m not sure it’s worth going through the array of MetO modifications to the ice code to see what’s going on, as we’re going to move to 4.2.1 soon.

atb299 commented 11 months ago

@jdha do you know if it is v4 or v4.2 that doesn't initialise to zero correctly? If the latter, then it could be worth flagging. Also note that a bug has been discovered in the v4.2 ice-ocean drag that may be important (https://forge.nemo-ocean.eu/nemo/nemo/-/issues/333). It is considered important enough to trigger a 4.2.2 release (very soon). In the meantime, the fix is 2 lines in iceupdate.F90, and I've added it to the NPD repo.

jdha commented 11 months ago

@atb299 I'm naively assuming it's a GO8 thing as both 4.0.4 and 4.2.1 pass SETTE (but that doesn't mean all code is fully tested)

jdha commented 11 months ago

Another quick look at this:

I've not found the source yet, but if you set ln_pnd_alb to false in namelist_ice_cfg_template you should get reproducibility.

I've also tested that, with ln_pnd_alb set to true, replacing zafrac_pnd = MIN( pafrac_pnd(ji,jj,jl), 1._wp - zafrac_snw ) with zafrac_pnd = 0._wp on line 132 of icealb.F90 also gives reproducibility.
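A toy Python illustration of why zeroing that expression could restore reproducibility. This is not the NEMO code: the function names and input values are hypothetical, and the sketch only assumes that the pond-fraction input differs by a last-bit amount between two runs. The origin of any such difference is not established in this thread.

```python
import math


def pond_fraction(pafrac_pnd: float, zafrac_snw: float) -> float:
    """Python stand-in for
    zafrac_pnd = MIN( pafrac_pnd(ji,jj,jl), 1._wp - zafrac_snw )."""
    return min(pafrac_pnd, 1.0 - zafrac_snw)


def pond_fraction_zeroed(pafrac_pnd: float, zafrac_snw: float) -> float:
    """Python stand-in for the test replacement zafrac_pnd = 0._wp."""
    return 0.0


# Hypothetical pond-fraction inputs from two runs, differing in the last bit.
run_a = 0.25
run_b = math.nextafter(0.25, 1.0)

# Where the pond fraction is the smaller term, the difference passes through.
print(pond_fraction(run_a, 0.5) == pond_fraction(run_b, 0.5))

# Where the snow term is smaller, the MIN hides the difference.
print(pond_fraction(run_a, 0.9) == pond_fraction(run_b, 0.9))

# With the expression zeroed, the result no longer depends on the input.
print(pond_fraction_zeroed(run_a, 0.5) == pond_fraction_zeroed(run_b, 0.5))
```

If this reading is right, the zeroing test points at pafrac_pnd (or whatever feeds it) rather than at the snow fraction, since the replacement removes only the pond-fraction dependence.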

jdha commented 11 months ago

There are two calls to ice_alb, one in ice_update_flx and one in ice_sbc_flx.

It appears that the reproducibility issue occurs in the latter, although from what I can see all inputs have been initialised.

jdha commented 10 months ago

For info: repeating this test with the 4.2.1 code passes the run.stat test

jdha commented 10 months ago

run.stat for master (4.0.4) and the 4.2 branch (using 4.0.4) are identical for a short run of 20 time steps (with ln_pnd_alb=.false.)
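For anyone repeating the check, a minimal Python sketch of the run.stat comparison. It treats the files as plain text and makes no assumption about the run.stat column layout; the function name and the paths in the usage note are hypothetical.

```python
from itertools import zip_longest
from pathlib import Path


def first_difference(path_a, path_b):
    """Return (line_number, line_a, line_b) for the first differing line,
    or None if the two files are identical line for line.

    A line missing from the shorter file is reported as None.
    """
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    # zip_longest pads the shorter file with None, so a truncated run
    # is reported as a difference rather than silently ignored.
    for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        if a != b:
            return number, a, b
    return None
```

Calling first_difference("master/run.stat", "branch_4.2/run.stat") returns None when the two runs match, and otherwise the first line at which they diverge, which is a reasonable place to start looking for the time step where the difference first appears.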