ACCESS-NRI / reproducibility

Framework and tools for reproducibility testing of models
GNU General Public License v3.0

Testing model output reproducibility #5

Closed aidanheerdegen closed 7 months ago

aidanheerdegen commented 1 year ago

(this issue was created from this comment: https://github.com/ACCESS-NRI/ACCESS-OM/issues/6#issuecomment-1248045967)

The ACCESS-OM spack builds need to be tested for correctness. Ideally, the spack builds would produce bitwise reproducible output when run.

COSIMA regularly runs run tests and bitwise reproducibility tests on their code. Links to the tests are visible in the status badges on the repo:

https://accessdev.nci.org.au/jenkins/job/ACCESS-OM2/job/reproducibility/

This clones https://github.com/COSIMA/access-om2.git and runs:

module use /g/data/hh5/public/modules && module load conda/analysis3-unstable && python -m pytest -s test/test_bit_reproducibility.py

The test is in the repository

https://github.com/COSIMA/access-om2/blob/master/test/test_bit_reproducibility.py

Specifically, this is the part that opens the existing log file, pulls out the checksums, and compares them to the checksums just produced:

https://github.com/COSIMA/access-om2/blob/43568e56f4a043075f5f07efaeefbca9a444406f/test/test_bit_reproducibility.py#L89-L91

This is the truth output

https://github.com/COSIMA/access-om2/blob/master/test/checksums/1deg_jra55_iaf-access-om2.out

So it should be possible to use the same run logic as that in the tests

https://github.com/ACCESS-NRI/ACCESS-OM/issues/6#issuecomment-1248045967

and then extract the checksums and compare them.
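As a rough illustration of the "extract the checksums and compare them" step, here is a minimal sketch. It is not the COSIMA test code. The line format (`[chksum] <field name> <integer>`) is an assumption about what the model log looks like, and the function names `extract_checksums`, `compare_checksums` and `logs_match` are hypothetical; the regex would need adjusting to match the real output.

```python
import re
from pathlib import Path

# ASSUMPTION: checksum lines look like "[chksum] <field name> <integer>".
# Adjust this pattern to whatever the real model log actually contains.
CHKSUM_RE = re.compile(r"\[chksum\]\s+(.+?)\s+(-?\d+)\s*$")


def extract_checksums(text):
    """Return (field, checksum) pairs in the order they appear in the log text."""
    pairs = []
    for line in text.splitlines():
        match = CHKSUM_RE.search(line)
        if match:
            pairs.append((match.group(1), int(match.group(2))))
    return pairs


def compare_checksums(expected, produced):
    """Return a list of mismatch descriptions; an empty list means they agree."""
    problems = []
    if len(expected) != len(produced):
        problems.append(
            f"checksum count differs: expected {len(expected)}, got {len(produced)}"
        )
    # Compare entry by entry so the first diverging field is easy to find.
    for index, (exp, got) in enumerate(zip(expected, produced)):
        if exp != got:
            problems.append(f"entry {index}: expected {exp}, got {got}")
    return problems


def logs_match(truth_path, new_path):
    """Compare the checksums in a stored truth log against a freshly produced log."""
    expected = extract_checksums(Path(truth_path).read_text())
    produced = extract_checksums(Path(new_path).read_text())
    return compare_checksums(expected, produced)
```

Under those assumptions, `logs_match(truth_file, new_log)` returning an empty list would indicate that the spack build reproduced the stored checksums; any entries in the list point to the fields that diverged.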

aekiss commented 1 year ago

Run reproducibility may depend on how spack compiled the dependencies, e.g. esmf compilation flags were breaking reproducibility in access-om3.

aidanheerdegen commented 1 year ago

The idea is to grab the bits of code from the COSIMA testing that are useful and make them available on gadi so we can run some testing interactively.

This is a first step in working out what testing we need to do and understanding how the testing works, so that we can implement a new CI pipeline.