NOAA-CEFI-Regional-Ocean-Modeling / ocean_BGC

model run crash with signal 11 at initialization #80

Closed: nikizadehgfdl closed this issue 2 weeks ago

nikizadehgfdl commented 1 month ago

I tried to run an existing ocean-ice COBALTv2 model after swapping the "master" branch of the NOAA ocean_BGC repo for dev/cefi of this repo. The model crashes during initialization with signal 11 (SIGSEGV, a segmentation fault, which usually points to an invalid memory access):

 ==>Note from generic_COBALT(generic_COBALT_register):Using Mocsy CO2 routine                                                                                                                                                            
fms_MOM6_SIS2_compile_cobv3.x:246076 terminated with signal 11 at PC=1589f6b SP=7ffe40770fe0.  Backtrace:  

Unfortunately no further traceback is printed so I have no idea where the issue occurs, other than it's before the main loop.

This happens for both "debug" and "prod" modes.

The same happens when I try a fully coupled model with om5 (as a prototype for ESM4.5).

Note that I am taking MOM6/SIS2 from their own repos, not from CEFI, and applying the following fix in two places in MOM_generic_tracer.F90:

       call generic_tracer_source(tv%T, tv%S, rho_dzt, dzt, dz_ml, G%isd, G%jsd, 1, dt, &
                G%areaT, get_diag_time_end(CS%diag), &
                optics%nbands, optics%max_wavelength_band, optics%sw_pen_band, optics%opacity_band, &
-               internal_heat=tv%internal_heat, frunoff=fluxes%frunoff, sosga=sosga)
+               internal_heat=tv%internal_heat, frunoff=fluxes%frunoff, sosga=sosga, &
+               geolat=G%geolatT, eqn_of_state=tv%eqn_of_state)

Do I need to fix MOM6/SIS2 in more places?

kshedstrom commented 1 month ago

The full patch:

diff --git a/config_src/external/GFDL_ocean_BGC/generic_tracer.F90 b/config_src/external/GFDL_ocean_BGC/generic_tracer.F90
index 42c386497..75cd57a08 100644
--- a/config_src/external/GFDL_ocean_BGC/generic_tracer.F90
+++ b/config_src/external/GFDL_ocean_BGC/generic_tracer.F90
@@ -6,6 +6,8 @@ module generic_tracer

   use g_tracer_utils, only : g_tracer_type, g_diag_type

+  use MOM_EOS,           only: EOS_type
+
   implicit none ; private

   public generic_tracer_register
@@ -68,7 +70,7 @@ contains
   !> Calls the corresponding generic_X_update_from_source routine for each package X
   subroutine generic_tracer_source(Temp,Salt,rho_dzt,dzt,hblt_depth,ilb,jlb,tau,dtts,&
        grid_dat,model_time,nbands,max_wavelength_band,sw_pen_band,opacity_band,internal_heat,&
-       frunoff,grid_ht, current_wave_stress, sosga)
+       frunoff,grid_ht, current_wave_stress, sosga, geolat, eqn_of_state)
     integer,                        intent(in) :: ilb    !< Lower bounds of x extent of input arrays on data domain
     integer,                        intent(in) :: jlb    !< Lower bounds of y extent of input arrays on data domain
     real, dimension(ilb:,jlb:,:),   intent(in) :: Temp   !< Potential temperature [deg C]
@@ -94,6 +96,8 @@ contains
     real, dimension(ilb:,jlb:),optional,  intent(in) :: grid_ht !< Unknown, and presently unused by MOM6
     real, dimension(ilb:,jlb:),optional , intent(in) :: current_wave_stress !< Unknown, and presently unused by MOM6
     real,                      optional , intent(in) :: sosga !< Global average sea surface salinity [ppt]
+    real, dimension(ilb:,jlb:),optional,  intent(in) :: geolat !< Latitude
+    type(EOS_type),            optional,  intent(in) :: eqn_of_state !< A pointer to the equation of state
   end subroutine generic_tracer_source

   !> Update the tracers from bottom fluxes
diff --git a/src/tracer/MOM_generic_tracer.F90 b/src/tracer/MOM_generic_tracer.F90
index f430e9451..23ab810ae 100644
--- a/src/tracer/MOM_generic_tracer.F90
+++ b/src/tracer/MOM_generic_tracer.F90
@@ -582,7 +582,7 @@ contains
       call generic_tracer_source(tv%T, tv%S, rho_dzt, dzt, dz_ml, G%isd, G%jsd, 1, dt, &
                G%areaT, get_diag_time_end(CS%diag), &
                optics%nbands, optics%max_wavelength_band, optics%sw_pen_band, optics%opacity_band, &
-               internal_heat=tv%internal_heat, frunoff=fluxes%frunoff, sosga=sosga)
+               internal_heat=tv%internal_heat, frunoff=fluxes%frunoff, sosga=sosga, geolat=G%geolatT, eqn_of_state=tv%eqn_of_state)
     else
       call generic_tracer_source(US%C_to_degC*tv%T, US%S_to_ppt*tv%S, rho_dzt, dzt, dz_ml, G%isd, G%jsd, 1, dt, &
                G%US%L_to_m**2*G%areaT(:,:), get_diag_time_end(CS%diag), &
@@ -590,7 +590,7 @@ contains
                sw_pen_band=G%US%QRZ_T_to_W_m2*optics%sw_pen_band(:,:,:), &
                opacity_band=G%US%m_to_Z*optics%opacity_band(:,:,:,:), &
                internal_heat=G%US%RZ_to_kg_m2*US%C_to_degC*tv%internal_heat(:,:), &
-               frunoff=G%US%RZ_T_to_kg_m2s*fluxes%frunoff(:,:), sosga=sosga)
+               frunoff=G%US%RZ_T_to_kg_m2s*fluxes%frunoff(:,:), sosga=sosga, geolat=G%geolatT, eqn_of_state=tv%eqn_of_state)
     endif

     ! This uses applyTracerBoundaryFluxesInOut to handle the change in tracer due to freshwater fluxes

nikizadehgfdl commented 1 month ago

Thanks @kshedstrom, I already have the patch for src/tracer/MOM_generic_tracer.F90 as above. In the experiment I am running, config_src/external/GFDL_ocean_BGC/generic_tracer.F90 is not compiled.

yichengt900 commented 1 month ago

Hmm, interesting. @nikizadehgfdl, would you mind pointing me to your experiment folder so I can take a look when I get a chance? Thanks!

kshedstrom commented 1 month ago

Sorry, I didn't read your message carefully enough. I am getting this compile failure on chinook:

mpif90 -Duse_libMPI -Duse_netCDF -DSPMD -DUSE_LOG_DIAG_FIELD_INFO -D_FILE_VERSION="`//import/c1/AKWATERS/kate/ESMG/ESMG-configs/src/mkmf/bin/git-version-string //import/c1/AKWATERS/kate/ESMG/ESMG-configs/src/ocean_BGC/generic_tracers/generic_COBALT.F90`" -DSTATSLABEL=\"gnu\" -DMAXFIELDMETHODS_=500 -Duse_AM3_physics -D_USE_LEGACY_LAND_ -DUSE_FMS2_IO -Duse_yaml -D_USE_GENERIC_TRACER -DMAX_FIELDS_=100 -DUSE_PRECISION=2 -DNOT_SET_AFFINITY -D_USE_MOM6_DIAG -Duse_netCDF -DHAVE_SCHED_GETAFFINITY  -I/usr/local/pkg/MPI/GCC/11.3.0/OpenMPI/4.1.4/netCDF-Fortran/4.5.4/include -I/usr/local/pkg/MPI/GCC/11.3.0/OpenMPI/4.1.4/netCDF-Fortran/4.5.4/include  -fcray-pointer -fdefault-double-8 -fdefault-real-8 -Waliasing -ffree-line-length-none -fno-range-check -I/usr/local/pkg/MPI/GCC/11.3.0/OpenMPI/4.1.4/netCDF-Fortran/4.5.4/include -fallow-invalid-boz -fallow-argument-mismatch -O2 -fbounds-check -I../../shared/repro  -c -I//import/c1/AKWATERS/kate/ESMG/ESMG-configs/src/MOM6/src/framework  //import/c1/AKWATERS/kate/ESMG/ESMG-configs/src/ocean_BGC/generic_tracers/generic_COBALT.F90
//import/c1/AKWATERS/kate/ESMG/ESMG-configs/src/ocean_BGC/generic_tracers/generic_COBALT.F90:140:30:

  140 |   use fms_mod,           only: open_namelist_file, close_file
      |                              1
Error: Symbol ‘open_namelist_file’ referenced at (1) not found in module ‘fms_mod’
//import/c1/AKWATERS/kate/ESMG/ESMG-configs/src/ocean_BGC/generic_tracers/generic_COBALT.F90:140:50:

  140 |   use fms_mod,           only: open_namelist_file, close_file
      |                                                  1
Error: Symbol ‘close_file’ referenced at (1) not found in module ‘fms_mod’
//import/c1/AKWATERS/kate/ESMG/ESMG-configs/src/ocean_BGC/generic_tracers/generic_COBALT.F90:254:11:

  254 |     ioun = open_namelist_file()
      |           1
Error: Function ‘open_namelist_file’ at (1) has no IMPLICIT type

andrew-c-ross commented 1 month ago

@kshedstrom I think you need -DINTERNAL_FILE_NML in your compiler flags. We probably need a less clumsy solution for this.

andrew-c-ross commented 1 month ago

@nikizadehgfdl do you have the new cobalt_input_nml section in input.nml, and the COBALT_input and COBALT_override files it points to?

&cobalt_input_nml
        parameter_filename = 'INPUT/COBALT_input',
                             'INPUT/COBALT_override'
/

nikizadehgfdl commented 1 month ago

@andrew-c-ross thanks. No, I do not have those files, and I bet that's the problem. Can you point me to a workdir or an XML that has these files so I can figure out what to put in them?

andrew-c-ross commented 1 month ago

@nikizadehgfdl See https://github.com/NOAA-GFDL/CEFI-regional-MOM6/blob/60944973f8e54585af9c8249b10bf913312ddbdb/xmls/NWA12/CEFI_NWA12_cobalt.xml#L598. Currently almost everything we use is the default, so the COBALT input and override files are basically empty. One exception: some regional models set do_case2_mod = True.

nikizadehgfdl commented 1 month ago

@kshedstrom thanks for bringing this up. I think the whole #ifdef INTERNAL_FILE_NML block is a thing of the past and should be replaced by:

read (input_nml_file, nml=generic_COBALT_nml, iostat=io_status)
ierr = check_nml_error(io_status,'generic_COBALT_nml')

All other components do it this way. I'll make a PR for that.

nikizadehgfdl commented 1 month ago

I made it past the signal 11 by adding the COBALT_input file. If the files are required to exist, a check and a graceful exit with a clear error message would be nice.
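
Until the model itself checks for these files, a user-side pre-flight script could catch the problem before a job is submitted. The following is only a sketch in Python, not part of ocean_BGC: the helper names `parameter_files` and `missing_parameter_files` are hypothetical, and it assumes the namelist looks like the cobalt_input_nml example earlier in this thread, with quoted paths relative to the run directory and the terminating "/" on its own line. It scans that one group with a regular expression and is not a general Fortran namelist parser.

```python
import re
from pathlib import Path


def parameter_files(nml_text, group="cobalt_input_nml"):
    """Return the quoted strings found in one namelist group.

    Hypothetical helper: a regex scan, not a full namelist parser.
    Assumes the group is closed by a "/" on a line of its own.
    """
    # Capture everything between "&group" and the closing "/" line.
    match = re.search(
        rf"&{re.escape(group)}\b(.*?)^\s*/\s*$",
        nml_text,
        flags=re.DOTALL | re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        return []
    # Collect every single- or double-quoted string inside the group.
    return [m.group(2) for m in re.finditer(r"(['\"])(.*?)\1", match.group(1))]


def missing_parameter_files(nml_path, run_dir="."):
    """List files named in the group that do not exist under run_dir."""
    text = Path(nml_path).read_text()
    return [
        name
        for name in parameter_files(text)
        if not (Path(run_dir) / name).is_file()
    ]
```

Calling `missing_parameter_files("input.nml")` from the run directory returns an empty list when every listed file is present, so a run script can stop with a readable message whenever the list is non-empty.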

Now I am dealing with the newly added tracers (mu_mem_ndm, ...), since I do not have restarts for them. Are there global "source" files to initialize them from?

kshedstrom commented 1 month ago

I believe many can be safely set to zero.

yichengt900 commented 1 month ago

Hi @nikizadehgfdl, I've included @feida6996 in this discussion, who is running COBALTv3 in the global domain. @feida6996, could you provide the initial conditions (ICs) for the 0.25-degree global ocean you are using?

feida6996 commented 1 month ago

Hi @nikizadehgfdl and @yichengt900, I used the following initial conditions for the 1/4-degree COBALTv3 experiment:

Temperature and salinity:
/gpfs/f5/gfdl_med/world-shared/global/WOA/WOA18/woa18_decav_ptemp_monthly_fulldepth_01.nc
/gpfs/f5/gfdl_med/world-shared/global/WOA/WOA18/woa18_decav_s_monthly_fulldepth_01.nc

NO3, PO4, SIO4, and oxygen:
/gpfs/f5/gfdl_med/world-shared/global/WOA/WOA18/woa18_all_n00_01.nc
/gpfs/f5/gfdl_med/world-shared/global/WOA/WOA18/woa18_all_p00_01.nc
/gpfs/f5/gfdl_med/world-shared/global/WOA/WOA18/woa18_all_i00_01.nc
/gpfs/f5/gfdl_med/world-shared/global/WOA/WOA18/woa18_all_o00_01.nc

Preindustrial TA and DIC:
/gpfs/f5/gfdl_a/world-shared/cmip6/datasets/ESM4/OBGC/GLODAPv2/GLODAPv2.2016b.oi-filled.20180322.nc

Other BGC fields (from an ESM4p-COBALT Preindustrial Control history file):
/gpfs/f5/gfdl_med/world-shared/global/initial/cobalt_tracer_source_4P_varn2p.nc

File for the new COBALT schemes plus NH3 ocean/atmosphere exchange (I'm not familiar with this one, but it is in the XML shared by Andrew):
/gpfs/f5/gfdl_a/world-shared/cmip6/datasets/ESM4/OBGC/IC/COBALT/init_ocean_cobalt_nh3.res.nc