Open penguian opened 7 years ago
rb4844 changed component from ACCESS model to ACCESS-CM2
pbd562@nci.org.au commented
Compiling MOM with -fp-model precise does not fix the issue.
pbd562@nci.org.au changed priority from major to minor
@scott.wales@bom.gov.au commented
I believe this is expected behaviour: collective MPI operations are not bit-reproducible across different processor decompositions, because floating-point addition is not associative and different decompositions combine the per-rank contributions in different orders.
From Nic's comment it sounded like there is a compile-time option for using reproducible MPI operations.
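To illustrate the point about summation order, here is a minimal Python sketch. It is not MOM or MPI code: the `combine` helper and the two "layouts" are hypothetical stand-ins for how two decompositions might group the same values into per-rank partial sums. `math.fsum` is used only as an example of an order-independent sum; it is not a claim about how any reproducible-sum option in MOM is implemented.

```python
import math

def combine(partitions):
    """Sum each rank's values locally, then add the partial sums in rank order."""
    return sum(sum(rank_values) for rank_values in partitions)

# The same three values, grouped differently by two hypothetical decompositions.
layout_a = [[0.1, 0.2], [0.3]]   # evaluates (0.1 + 0.2) + 0.3
layout_b = [[0.1], [0.2, 0.3]]   # evaluates 0.1 + (0.2 + 0.3)

result_a = combine(layout_a)
result_b = combine(layout_b)

# The grouping changes the rounding, so the totals differ in the last bit.
print(result_a == result_b)

# An exactly rounded sum does not depend on the order of its inputs.
flat = [0.1, 0.2, 0.3]
exact_forward = math.fsum(flat)
exact_reversed = math.fsum(reversed(flat))
print(exact_forward == exact_reversed)
```

A one-bit difference like this in a global sum can then grow over subsequent time steps, which would be consistent with runs diverging visibly after a day.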
@martin.dix@anu.edu.au commented
Suite u-an301 can compare runs across different processor decompositions. The MOM build was modified to set the REPRO flag, so the build options were:
-Duse_netCDF -Duse_netCDF3 -Duse_libMPI -DACCESS -DACCESS_CM -fpp -Wp,-w -fno-alias
-safe-cray-ptr -fpe0 -ftz -assume byterecl -i4 -r8 -traceback -nowarn -check noarg_temp_created
-assume buffered_io -convert big_endian -O2 -debug minimal -no-vec -fp-model precise
I followed Marshall's suggestion and added
do_bitwise_exact_sum=.true.
to
[namelist:ocean_barotropic_nml]
[namelist:ocean_density_nml]
[namelist:ocean_grids_nml]
[namelist:ocean_mixdownslope_nml]
[namelist:ocean_overexchange_nml]
[namelist:ocean_overflow_nml]
[namelist:ocean_rivermix_nml]
[namelist:ocean_sbc_nml]
[namelist:ocean_tracer_diag_nml]
[namelist:ocean_velocity_diag_nml]
UM and CICE used the same processor decomposition in each run, while MOM used 8x12 in one run and 12x8 in the other.
The results still differed after one day.
keyword_MOM
keyword_reproducibility
by rb4844
The current ACCESS-CM2 configuration assigns 8 x 12 CPUs to MOM. Changing this to 12 x 8 works, but the surface temperature results are affected.
Issue migrated from trac:312 at 2024-01-31 18:28:31 +1100