COSIMA / cice5

Clone of The Los Alamos sea ice model (CICE) with ACCESS drivers. See https://github.com/CICE-Consortium/CICE-svn-trunk/tree/cice-5.1.2

Can't output Tinz #72

Open aekiss opened 11 months ago

aekiss commented 11 months ago

As reported here, ACCESS-OM2 1deg_jra55_ryf aborts when f_tinz is set to anything other than 'x'. The abort occurs the first time the data would be written.

Abort with message Unknown Error: Unrecognized error code in file /g/data/v45/aek156/CHUCKABLE/access-om2/src/cice5/ParallelIO/src/clib/pio_darray_int.c at line 687

This is the offending line: https://github.com/NCAR/ParallelIO/blob/7e242f78bd1b4766518aff44fda17ff50eed6188/src/clib/pio_darray_int.c#L687

Possibly related: https://github.com/COSIMA/cice5/issues/62#issuecomment-1097556730

It has been possible to output Tinz in other runs, e.g. 0.1° IAF.

access-hive-bot commented 11 months ago

This issue has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/output-internal-sea-ice-temperatures-from-access-om2/1531/10

aekiss commented 11 months ago

Image              PC                Routine            Line        Source
cice_auscom_360x3  0000000000D21214  Unknown               Unknown  Unknown
libpthread-2.28.s  0000146A6CE2CCF0  Unknown               Unknown  Unknown
libucp.so.0.0.0    0000146A585EAEF8  ucp_worker_progre     Unknown  Unknown
hmca_bcol_ucx_p2p  0000146A461440D6  Unknown               Unknown  Unknown
hmca_bcol_ucx_p2p  0000146A46143190  Unknown               Unknown  Unknown
libhcoll.so.1.0.1  0000146A4BD0B0EC  hmca_coll_ml_barr     Unknown  Unknown
mca_coll_hcoll.so  0000146A5001DCDA  mca_coll_hcoll_ba     Unknown  Unknown
mca_sharedfp_lock  0000146A3A354F30  mca_sharedfp_lock     Unknown  Unknown
libmca_common_omp  0000146A3B7CBC87  mca_common_ompio_     Unknown  Unknown
mca_io_ompio.so    0000146A3B9D7CC1  mca_io_ompio_file     Unknown  Unknown
libmpi.so.40.20.2  0000146A6E4CC498  PMPI_File_set_vie     Unknown  Unknown
libhdf5.so.103.1.  0000146A6A9A1E64  Unknown               Unknown  Unknown
libhdf5.so         0000146A6A7AC2AF  H5FD_write            Unknown  Unknown
libhdf5.so         0000146A6A787A32  H5F__accum_write      Unknown  Unknown
libhdf5.so         0000146A6A8A4B24  H5PB_write            Unknown  Unknown
libhdf5.so         0000146A6A79337B  H5F_block_write       Unknown  Unknown
libhdf5.so         0000146A6A736369  H5D__chunk_alloca     Unknown  Unknown
libhdf5.so.103.1.  0000146A6A747081  Unknown               Unknown  Unknown
libhdf5.so         0000146A6A74CD0D  H5D__alloc_storag     Unknown  Unknown
libhdf5.so         0000146A6A74EE3C  H5D__set_extent       Unknown  Unknown
libhdf5.so         0000146A6A725452  H5Dset_extent         Unknown  Unknown
libnetcdf.so       0000146A6E157EF1  NC4_put_vars          Unknown  Unknown
libnetcdf.so       0000146A6E1573BF  NC4_put_vara          Unknown  Unknown
libnetcdf.so.18.0  0000146A6E0F9CE7  Unknown               Unknown  Unknown
libnetcdf.so       0000146A6E0FAF79  nc_put_vara           Unknown  Unknown
cice_auscom_360x3  00000000009A68F5  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000009A45E0  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000009AAE01  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000009668C0  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000008F841D  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000006E2F23  ice_history_write        1223  ice_history_write.f90
cice_auscom_360x3  0000000000694EC4  ice_history_mp_ac        2023  ice_history.f90
cice_auscom_360x3  000000000041633F  cice_runmod_mp_ci         411  CICE_RunMod.f90
cice_auscom_360x3  0000000000411292  MAIN__                     70  CICE.f90

aekiss commented 11 months ago

The config uses cice_auscom_360x300_24p_edcfa6f_libaccessom2_d750b4b.exe.

There are several versions of ice_history_write.f90 in the codebase. I expect https://github.com/COSIMA/cice5/blob/master/io_pio/ice_history_write.F90 is the one that's being used here, but the line number 1223 doesn't make sense.

aidanheerdegen commented 11 months ago

> the line number 1223 doesn't make sense.

All of those source files are preprocessed into .f90 versions for compilation, which can change line numbers (if code is included). In this case I believe the preprocessed file would differ only by some blank lines, but it is worth checking the .f90 versions nonetheless.

In the past I have also found line numbers from optimised code to be unreliable, and compiling with -O0 was needed to get accurate line numbers. I don't know if that is still the case.
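One way to do that check is to print the lines around the reported line number in the preprocessed file. This is only a minimal sketch: the helper name `show_context` and the example path in the comment are hypothetical, and the location of the generated .f90 file depends on the build.

```python
from pathlib import Path


def show_context(path, lineno, radius=3):
    """Return the lines around a 1-based line number, each prefixed with its number."""
    lines = Path(path).read_text(errors="replace").splitlines()
    first = max(1, lineno - radius)
    last = min(len(lines), lineno + radius)
    out = []
    for n in range(first, last + 1):
        # Mark the line that the traceback pointed at.
        marker = ">>" if n == lineno else "  "
        out.append(f"{marker} {n:6d}  {lines[n - 1]}")
    return out


# Example (hypothetical path to the preprocessed source in the build directory):
# print("\n".join(show_context("build/ice_history_write.f90", 1223)))
```

If the marked line is not a PIO or netCDF write call, the traceback line number is probably off, for example because of optimisation.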

aekiss commented 11 months ago

1deg_jra55_ryf can't output Tinz using

/g/data/ik11/inputs/access-om2/bin/cice_auscom_360x300_24p_edcfa6f_libaccessom2_d750b4b.exe

But 01deg_jra55v140_iaf_cycle4 did output Tinz successfully in 2014 using the older exe

/g/data/ik11/inputs/access-om2/bin/cice_auscom_18x15.3600x2700_1682p_3a5d05f_libaccessom2_0ab7295.exe

These are the cice5 3a5d05f...edcfa6f and libaccessom2 0ab7295...d750b4b differences. Note that the libaccessom2 version has changed, but apparently not in a way that would affect answers.

Perhaps commit https://github.com/COSIMA/cice5/commit/1a98130e64c1e2c87993846e7ff05836f3ab590a is to blame?

I tried using the old, working version of the 1deg exe in 1deg_jra55_ryf

/g/data/ik11/inputs/access-om2/bin/cice_auscom_360x300_24p_3a5d05f_libaccessom2_0ab7295.exe

but this fails with

 ice_read_nc_xy: Cannot find variable ssn_i

because this is not a BGC run and it's missing https://github.com/COSIMA/cice5/commit/27bcc454cf292710c996c10e87218ec1e4c4bb7e.

So I tried again with a BGC config: first git checkout master+bgc and then use this in config.yaml:

      exe: /g/data/ik11/inputs/access-om2/bin/cice_auscom_360x300_24p_3a5d05f_libaccessom2_0ab7295.exe

but this dies with

Abort with message NetCDF: HDF error in file /home/156/aek156/github/COSIMA/access-om2-new/src/cice5/ParallelIO/src/clib/pio_darray_int.c at line 721

unless f_tinz = 'x'. So how come this CICE version worked for 01deg_jra55v140_iaf_cycle4 but not here? Does it also need mushy ice (ktherm=2)?

anton-seaice commented 2 months ago

I cloned https://github.com/ACCESS-NRI/access-om2-configs/tree/release-1deg_jra55_ryf and set:

    f_sinz            = 'md'
    f_tinz            = 'md'

in cice_in.nml

I ran for one month and both the daily and monthly output just worked :) So I wonder if this has been inadvertently fixed? Possibly the issue has been fixed in one of the dependencies (e.g. hdf5)? Or the build is subtly different in a way that works now.
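For anyone repeating this test, a minimal sketch for listing which history fields are switched on in a copy of cice_in.nml follows. It assumes, as in this thread, that a value of 'x' disables a field and that other letters such as 'md' select output streams. The function name `enabled_history_flags` is hypothetical, and this is a simple line-based scan, not a full Fortran namelist parser.

```python
import re

# Matches lines such as:  f_tinz = 'md'   (single- or double-quoted string values only)
FLAG_RE = re.compile(r"""^\s*(f_\w+)\s*=\s*['"]([^'"]*)['"]""", re.IGNORECASE)


def enabled_history_flags(namelist_text):
    """Return {flag_name: value} for quoted f_* entries whose value is not all 'x'."""
    flags = {}
    for line in namelist_text.splitlines():
        line = line.split("!", 1)[0]  # drop Fortran-style trailing comments
        m = FLAG_RE.match(line)
        # A value made up only of 'x' characters is treated as disabled (assumption).
        if m and m.group(2).strip("xX "):
            flags[m.group(1).lower()] = m.group(2)
    return flags


# Example:
# with open("cice_in.nml") as f:
#     print(enabled_history_flags(f.read()))
```

Unquoted entries (for example logical switches) are deliberately skipped, so the result only covers string-valued history flags like f_tinz and f_sinz.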