Closed kshedstrom closed 2 months ago
The depth index for the temperature and salinity used to calculate density, kmld_ref
, is a float rather than an int. I think this was also present in ec631ef, so maybe it's not the problem, but just wanted to point that out.
@kshedstrom, thank you for reporting, and @andrew-c-ross, thank you for your prompt check and response. I agree that it's probably better to change kmld_ref
from real
to int
to avoid confusion. However, as @andrew-c-ross pointed out, it exists in both tags, so there may be something else causing the issue. I just did a quick test on Gaea using 1D, NWA, and NEP cases but could not reproduce this error.
@kshedstrom, would you mind providing more details? Did this error occur on Gaea or your HPC system? Also, what kind of compiler did you use in your case? Thank you.
@yichengt900 This is on gaea, the script: /ncrc/home2/Katherine.Hedstrom/ESMG/ESMG-configs/Arctic6/BGC/run62a.xml The detritus is around still too, under /gpfs/f5/cefi/scratch/Katherine.Hedstrom/work
This is mysterious. Are we sure this wasn't a system issue that the EOS happened to catch in COBALT? If you run the same simulation again, does it crash in the same place?
Nice catch on k_mld Andrew. It would be great if did not have to calculate densities or potential densities at all. This has been suggested with the MOM6 team, but there is some reluctance that I'm not sure I fully understand....
If I'm looking at the right run, the traceback includes
Image PC Routine Line Source
fms_MOM6_SIS2_COB 00000000022142DB Unknown Unknown Unknown
libpthread-2.31.s 00001515B8E89910 Unknown Unknown Unknown
libpthread-2.31.s 00001515B8E887D3 close Unknown Unknown
libdarshan.so.0.0 00001515B84223F3 close Unknown Unknown
libhdf5_intel.so. 00001515B882AC3B Unknown Unknown Unknown
libhdf5_intel.so. 00001515B8818692 H5FD_close Unknown Unknown
libhdf5_intel.so. 00001515B88009F3 Unknown Unknown Unknown
libhdf5_intel.so. 00001515B8801B04 H5F_try_close Unknown Unknown
libhdf5_intel.so. 00001515B8801779 H5F__close Unknown Unknown
libhdf5_intel.so. 00001515B8A68976 H5VL__native_file Unknown Unknown
libhdf5_intel.so. 00001515B8A51CD8 H5VL_file_close Unknown Unknown
libhdf5_intel.so. 00001515B87FB574 Unknown Unknown Unknown
libhdf5_intel.so. 00001515B889107C H5I_clear_type Unknown Unknown
libhdf5_intel.so. 00001515B87FB50E H5F_term_package Unknown Unknown
libhdf5_intel.so. 00001515B8727905 H5_term_library Unknown Unknown
libc-2.31.so 00001515B4456B79 Unknown Unknown Unknown
libc-2.31.so 00001515B4456D0A Unknown Unknown Unknown
libinfinipath.so. 00001515AEE04602 Unknown Unknown Unknown
libpthread-2.31.s 00001515B8E89910 Unknown Unknown Unknown
This sounds a lot like some of the filesystem problems we've been having.
@kshedstrom, thank you for sharing the XML. I noticed that for the MOM6 version, you were pointing to ESMG/MOM6 (which is up to date with dev/gfdl
).
It is suitable for the physics-only run, which includes the OBC fixes, but it will not work with COBALTv3.0 due to the lack of a few minor tracer interface changes (we simplified the changes back to January). In order to make MOM6 work with COBALTv3, unfortunately you will need to include these changes in MOM6 generic_tracer.F90.
You can either create a new branch with these changes in ESMG, or use dev/cefi MOM6 , which is almost identical to dev/gfdl
and dev/esmg
except for the interface changes.
I had three separate runs which all failed in the call to EOS.
I had a version with some interface changes, but I thought you had backed them out. How did this compile?
The new COBALTv3 requires geolat
(for day length calculation) and eqn_of_state
(for mixed layter density calculation), which need to be passed from MOM6 during generic_tracer_source call.
I checked the source codes you used under /gpfs/f5/cefi/scratch/Katherine.Hedstrom/fre/Arc_12
:
MOM6_SIS2_COBALT_compile_symm/src/mom6/src/MOM6/src/tracer/MOM_generic_tracer.F90
: no interface changeMOM6_SIS2_COBALT_compile_symm/src.20240403.134121/mom6/src/MOM6/src/tracer/MOM_generic_tracer.F90
: no interface changeMOM6_SIS2_COBALT_compile_symm_a/src/mom6/src/MOM6/src/tracer/MOM_generic_tracer.F90
: no interface changeMOM6_SIS2_compile_ciod/src/mom6/src/MOM6/src/tracer/MOM_generic_tracer.F90
: old interface changes before #7 I think the reason why you can still compile the code is that we set geolat
and eqn_of_state
as optional inputs. This allows the generic_tracer_source
to still work with COBALTv2.
So do I need to be updating to pass the EOS as an optional argument? Is that what's failing? Should it fail in a more informative way if it doesn't get the EOS?
Yes, you will need them if you are using COBALT from the latest dev/cefi ocean_BGC branch. @charliestock and @theresa-morrison , just keeping both you in the loop. If we decide to keep the density calculation within COBALTv3, we could consider adding a warning/fatal error as @kshedstrom suggested.
This issue has been address by PR#33. Will close it for now.
Yesterday's dev/cefi gave me this stack trace:
I recompiled with ec631ef65375de78c35a1891e67502b49b230507 and it runs past this (and runs into the imbalance issues).