Closed ekluzek closed 1 month ago
Note, I'm also seeing this for the Bgc NoAnthro test:
SMS_D_Ld3_PS.f09_g17.I1850Clm60BgcNoAnthro.derecho_intel.clm-decStart1851_noinitial--clm-matrixcnOn
so this problem is somehow linked to the NoAnthro setup and not just whether the test is Sp or Bgc vs BgcCrop.
In the standup discussion, @samsrabin suggested a couple of ideas to try:
This patch should solve it, although I haven't even tested whether it builds:
diff --git a/src/biogeochem/FireEmisFactorsMod.F90 b/src/biogeochem/FireEmisFactorsMod.F90
index e97082c0b..7f7f470f3 100644
--- a/src/biogeochem/FireEmisFactorsMod.F90
+++ b/src/biogeochem/FireEmisFactorsMod.F90
@@ -11,6 +11,7 @@ module FireEmisFactorsMod
use shr_kind_mod, only : r8 => shr_kind_r8
use abortutils, only : endrun
use clm_varctl, only : iulog
+ use clm_varpar, only : maxveg
!
implicit none
private
@@ -20,8 +21,6 @@ module FireEmisFactorsMod
public :: fire_emis_factors_init
public :: fire_emis_factors_get
-! !PRIVATE MEMBERS:
- integer :: npfts ! number of plant function types
!
type emis_eff_t
real(r8), pointer :: eff(:) ! emissions efficiency factor
@@ -73,10 +72,7 @@ contains
call endrun(errmes)
endif
- factors(:npfts) = comp_factors_table( ndx )%eff(:npfts)
- if ( size(factors) > npfts )then
- factors(npfts+1:) = comp_factors_table( ndx )%eff(nc3crop)
- end if
+ factors(:maxveg) = comp_factors_table( ndx )%eff(:maxveg)
molecwght = comp_factors_table( ndx )%wght
end subroutine fire_emis_factors_get
@@ -126,9 +122,8 @@ contains
call ncd_inqdlen( ncid, dimid, n_comps, name='Comp_Num')
call ncd_inqdlen( ncid, dimid, n_pfts, name='PFT_Num')
- npfts = n_pfts
- if ( npfts /= mxpft .and. npfts /= 16 )then
- call endrun('Number of PFTs on fire emissions file is NOT correct. Its neither the total number of PFTS nor 16')
+ if ( n_pfts < maxveg )then
+ call endrun('Number of PFTs on fire emissions file is less than the number of PFTs in the run')
end if
ierr = pio_inq_varid(ncid,'Comp_EF', comp_ef_vid)
@@ -146,7 +141,7 @@ contains
call bld_hash_table_indices( comp_names )
do i=1,n_comps
start=(/i,1/)
- count=(/1,npfts/)
+ count=(/1,n_pfts/)
ierr = pio_get_var( ncid, comp_ef_vid, start, count, comp_factors )
call enter_hash_data( trim(comp_names(i)), comp_factors, comp_molecwghts(i) )
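To make the intent of the patch easier to follow, here is a minimal Python sketch of the patched lookup logic. It is not CTSM code: the function name `get_factors_for_run` and the list-based inputs are hypothetical, and `maxveg` simply stands in for the number of PFTs in the run, as the patch uses it. The idea is to reject a file with too few PFTs and otherwise copy only the leading `maxveg` entries.

```python
def get_factors_for_run(file_eff, maxveg):
    """Return the first `maxveg` emission factors read from the file.

    file_eff : per-PFT factors for one compound, as read from the file
               (hypothetical stand-in for comp_factors_table(ndx)%eff)
    maxveg   : number of PFTs in the run (assumed meaning, per the patch)
    """
    n_pfts = len(file_eff)
    # Mirrors the new check in the patch: the file may hold more PFTs
    # than the run needs, but never fewer.
    if n_pfts < maxveg:
        raise ValueError(
            "Number of PFTs on fire emissions file is less than "
            "the number of PFTs in the run"
        )
    # Mirrors factors(:maxveg) = eff(:maxveg): copy only what the run uses.
    return list(file_eff[:maxveg])


# A 78-PFT file used by a 16-PFT run: only the first 16 values are taken.
file_eff_78 = [float(i) for i in range(1, 79)]
factors_16 = get_factors_for_run(file_eff_78, 16)
```

One consequence visible in the diff is that the old fallback, which filled the extra PFTs from the `nc3crop` entry when the file held only 16 PFTs, is removed. With the patch as written, a file with fewer PFTs than the run aborts instead.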
I replicated the issue with a standard 16-pft dataset (so not a NoAnthro one) on Izumi with the nag compiler as follows:
[i017.cgd.ucar.edu:mpi_rank_29][error_sighandler] Caught error: Aborted (signal 6)
Runtime Error: /fs/cgd/data0/erik/ctsm_worktree/quickfix/src/biogeochem/FireEmisFactorsMod.F90, line 76: Subscript 1 of by ESMAPP
[i017.cgd.ucar.edu:mpi_rank_10][error_sighandler] Caught error: Aborted (signal 6)
FACTORS (value 78) is out of range (1:16)400: Called by CLM_INSTMOD:CLM_INSTINIT
/fs/cgd/data0/erik/ctsm_worktree/quickfix/src/main/clm_initializeMod.F90, li
Program terminated by fatal error
/fs/cgd/data0/erik/ctsm_worktree/quickfix/src/biogeochem/FireEmisFactorsMod.F90, line 76: Error occurred in FIREEMISFACTORSMOD:FIRE_EMIS_FACTORS_GET
/fs/cgd/data0/erik/ctsm_worktree/quickfix/src/biogeochem/CNFireEmissionsMod.F90, ne 409: Called by CLM_INITIALIZEMOD:INITIALIZE2
/fs/cgd/data0/erik/ctsm_worktree/quickfix/src/cpl/nuopc/lnd_comp_nuopc.F90, line 659: Called bline 68: Called by CNFIREEMISSIONSMOD:INIT
/fs/cgd/data0/erik/ctsm_worktree/quickfix/src/main/clm_instMod.F90, line 400: Cally LND_COMP_NUOPC:INITIALIZEREALIZE
/fs/cgd/data0/erik/ctsm_worktree/quickfix/components/cmeps/cime_config/../cesm/driver/esmApp.F90, line 128: Called by ESMAPP
ed by CLM_INSTMOD:CLM_INSTINIT
/fs/cgd/data0/erik/ctsm_worktree/quickfix/src/main/clm_initializeMod.F90, line 409: Called by CLM_INITIALIZEMOD:INITIALIZE2
/fs/cgd/data0/erik/ctsm_worktree/quickfix/src/cpl/nuopc/lnd_comp_nuopc.F90, line 659: Called by LND_COMP_NUOPC:INITIALIZEREALIZE
/fs/cgd/data0/erik/ctsm_worktree/quickfix/components/cmeps/cime_config/../cesm/driver/esmApp.F90, line 128: Called by ESMAPP
[i017.cgd.ucar.edu:mpi_rank_21][error_sighandler] Caught error: Aborted (signal 6)
[i017.cgd.ucar.edu:mpi_rank_0][error_sighandler] Caught error: Aborted (signal 6)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 243471 RUNNING AT i017.cgd.ucar.edu
= EXIT CODE: 134
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
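The key message in the interleaved output above is that subscript 1 of FACTORS (value 78) is out of range (1:16). As a rough illustration of that failure mode, the Python sketch below emulates a bounds-checked Fortran section assignment like the pre-patch `factors(:npfts) = ...%eff(:npfts)`. The sizes 16 and 78 are taken from the error message and the 78-PFT emissions file name. The helper `assign_section` is hypothetical and only mimics what a compiler's runtime bounds checking does.

```python
def assign_section(dest, src, upper):
    """Emulate Fortran `dest(:upper) = src(:upper)` with bounds checking on."""
    if upper > len(dest):
        # Comparable to the NAG runtime error in the log above.
        raise IndexError(
            f"subscript value {upper} is out of range (1:{len(dest)})"
        )
    if upper > len(src):
        raise IndexError(
            f"subscript value {upper} is out of range (1:{len(src)})"
        )
    dest[:upper] = src[:upper]
    return dest


factors = [0.0] * 16                       # array sized for a 16-PFT run
eff = [float(i) for i in range(1, 79)]     # 78 PFTs read from the file
npfts = len(eff)                           # pre-patch: taken from the file

try:
    assign_section(factors, eff, npfts)
    message = ""
except IndexError as err:
    message = str(err)
```

Without bounds checking, an assignment like this could write past the end of the array without an immediate error, which may be why the problem surfaced in debug tests.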
This was resolved in ctsm5.3.0
Brief summary of bug
SMS_D_Ld3_PS.f09_g17.I1850Clm60SpNoAnthro.derecho_intel.clm-decStart1851_noinitial
test fails due to a glitch in fire emissions, which were turned on in the ctsm5.3.0 prototype.
General bug information
CTSM version you are using: branch_tags/ctsm5.3.n03_ctsm5.2.028-20-g317dc11d0
Does this bug cause significantly incorrect results in the model's science? No
Configurations affected:
Some Sp simulations with -fire_emis on and the fire_emission_factors_78PFTs_c20240624.nc file
I don't see what's different about this test from the ones that pass
Here's the list of Sp tests with fire_emis on that pass:
ERP_D_Ld3_PS.f09_g17.I2000Clm50Sp.derecho_intel.clm-prescribed
ERP_D_Ld5.f10_f10_mg37.I2000Clm60Sp.derecho_intel.clm-decStart
ERP_D_Ld5.f10_f10_mg37.IHistClm45Sp.derecho_intel.clm-decStart
ERP_D_Ld5.f10_f10_mg37.IHistClm50SpCru.derecho_gnu.clm-drydepnomegan
ERP_D_Ld5.f10_f10_mg37.IHistClm60Sp.derecho_intel.clm-default
ERP_D_Ld5.ne30pg3_t232.IHistClm51Sp.derecho_intel.clm-default
ERP_P64x2_D.f10_f10_mg37.I2000Clm50SpRtmFl.derecho_intel.clm-default
ERP_P64x2_D_Ld10.f10_f10_mg37.IHistClm50SpG.derecho_intel.clm-glcMEC_decrease
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Clm45Sp.derecho_intel.clm-default
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_gnu.clm-default
ERS_D_Ld10.f10_f10_mg37.IHistClm50Sp.derecho_intel.clm-collapse_pfts_78_to_16_decStart_f10
NCK_Ld1.f10_f10_mg37.I2000Clm50Sp.derecho_intel.clm-default
SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.derecho_intel.clm-ptsRLA
SMS_D_Ld1_PS.f09_g17.I1850Clm50Sp.derecho_intel.clm-default
SMS_D_Ld1_PS.f19_f19_mg17.I2010Clm50Sp.derecho_intel.clm-clm50cam6LndTuningMode
SMS_D_Ln9_P128x3.f19_g17.IHistClm50Sp.derecho_intel.clm-waccmx_offline
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60SpRs.derecho_intel.clm-default--clm-NEON-TOOL
SMS_Lm37.f10_f10_mg37.I1850Clm50SpG.derecho_intel.clm-glcMEC_long
SMS_Ln9.f10_f10_mg37.I2000Clm50Sp.derecho_gnu.clm-clm50cam5LndTuningModeZDustSoilErod
SMS_Ln9.ne30pg2_ne30pg2_mg17.I1850Clm50Sp.derecho_intel.clm-clm50cam6LndTuningMode
SMS_Ln9.ne3pg3_ne3pg3_mg37.I2000Clm50Sp.derecho_gnu.clm-clm50cam6LndTuningMode
SMS_P384x2_D_Ld5.f19_g17.I2000Clm50Sp.derecho_intel.clm-default
Details of bug
It turns out we normally run with -fire_emis on for almost all of our tests (except FATES tests). Note that when you run Sp compsets, the coupler fire variables are just output as missing, so there really isn't a good reason to run Sp compsets with -fire_emis on.
Important output or errors that show the problem