ACCESS-NRI / access-om2-configs

Standard ACCESS-OM2 configurations released and supported by ACCESS-NRI

Scheduled Repro Check Failed for Config release-1deg_jra55_iaf_bgc-2.0 #112

Open github-actions[bot] opened 5 months ago

github-actions[bot] commented 5 months ago

There was a failure of a monthly reproducibility check on ACCESS-NRI/access-om2-configs. Logs, checksums and other artifacts can be found at the Failed Run Log link below.

Model: access-om2, found here: https://github.com/ACCESS-NRI/access-om2
Config Repo: access-om2-configs, found here: https://github.com/ACCESS-NRI/access-om2-configs
Config Tag Tested for Reproducibility: release-1deg_jra55_iaf_bgc-2.0, found here: https://github.com/ACCESS-NRI/access-om2-configs/releases/tag/release-1deg_jra55_iaf_bgc-2.0
Failed Run Log: https://github.com/ACCESS-NRI/access-om2-configs/actions/runs/9326016176
Experiment Location (Gadi): /scratch/tm70/repro-ci/experiments/access-om2/release-1deg_jra55_iaf_bgc-2.0

Tagging @ACCESS-NRI/model-release

aidanheerdegen commented 5 months ago

Extracting and using the code from https://github.com/ACCESS-NRI/model-config-tests/blob/809122db808cb53cf1be0aabab179422408c80af/src/model_config_tests/test_bit_reproducibility.py#L108-L120

from pathlib import Path

from model_config_tests.exp_test_helper import ExpTestHelper

# Experiment location on Gadi, as given in the failure notice
testdir = Path('/scratch/tm70/repro-ci/experiments/access-om2/release-1deg_jra55_iaf_bgc-2.0')

# One experiment runs two consecutive 1-day segments, the other a single 2-day run
exp_2x1day = ExpTestHelper(testdir / 'control' / 'test_restart_repro_2x1day', testdir / 'lab')
exp_2day = ExpTestHelper(testdir / 'control' / 'test_restart_repro_2day', testdir / 'lab')

# Checksums from the first and second 1-day segments
checksums_1d_0 = exp_2x1day.extract_checksums()
checksums_1d_1 = exp_2x1day.extract_checksums(exp_2x1day.output001)

# Checksums from the single 2-day run
checksums_2d = exp_2day.extract_checksums()

Output:

Unequal checksum: adic: -1280903534585776154
Unequal checksum: caco3: -1434922612388060372
Unequal checksum: alk: 4385096758226071011
Unequal checksum: dic: -2923014224314626136
Unequal checksum: no3: 2500362083272763959
Unequal checksum: phy: 1742031636856931459
Unequal checksum: o2: 8397596938827414417
Unequal checksum: fe: -104460091757843259
Unequal checksum: zoo: 3091091178031614202
Unequal checksum: det: 2083286045134757037
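
The snippet above stops at checksum extraction, so the comparison step that reports the unequal fields is not shown. As a rough sketch only, assuming each checksum set is a plain mapping from field name to a list of checksum strings (a hypothetical layout, not necessarily what `extract_checksums` returns) and assuming a field passes when its 2-day checksums all appear in one of the 1-day segments, the check could look like this. The function name `unequal_fields` and the sample values are invented for illustration.

```python
from typing import Dict, List

# Hypothetical layout: field name -> list of checksum strings
Checksums = Dict[str, List[str]]


def unequal_fields(long_run: Checksums,
                   short_run_0: Checksums,
                   short_run_1: Checksums) -> List[str]:
    """Return fields whose long-run checksums are missing from both short runs."""
    mismatched = []
    for field, values in long_run.items():
        # Pool the checksums seen for this field across both short segments.
        seen = set(short_run_0.get(field, [])) | set(short_run_1.get(field, []))
        if any(value not in seen for value in values):
            mismatched.append(field)
    return mismatched


# Illustrative values only, not taken from the failing run.
two_day = {"temp": ["111"], "no3": ["222"]}
day_one = {"temp": ["100"], "no3": ["200"]}
day_two = {"temp": ["111"], "no3": ["999"]}

for field in unequal_fields(two_day, day_one, day_two):
    print(f"Unequal checksum: {field}: {two_day[field][0]}")
```

With these made-up inputs only `no3` is flagged, because its 2-day checksum appears in neither 1-day segment. The real test may compare the runs differently, so the linked source remains the reference for the actual logic.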

CodeGat commented 5 months ago

When checking other release configs, I noted that only the BGC configs fail the restart reproducibility check. For example, for the release-01deg_jra55_ryf[_bgc] restart repro:

access-hive-bot commented 5 months ago

This issue has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/cosima-twg-meeting-minutes-2024/1734/10