ESMCI / cime

Common Infrastructure for Modeling the Earth
http://esmci.github.io/cime

ERI tests failing for me in CTSM #4589

Closed by ekluzek 9 months ago

ekluzek commented 9 months ago

With cime6.0.217_httpsbranch01 in CTSM (https://github.com/ESCOMP/CTSM/pull/2385), the following ERI tests in the test list are failing:

ERI_C2_Ld9.f10_f10_mg37.I2000Clm51BgcCrop.derecho_gnu.clm-default
ERI_D.ne30pg3_t232.I1850Clm51BgcCrop.derecho_intel.clm-clm51cam6LndTuningMode
ERI_D_Ld9.f10_f10_mg37.I1850Clm45Bgc.derecho_gnu.clm-default
ERI_D_Ld9.f10_f10_mg37.I1850Clm51Bgc.derecho_gnu.clm-default
ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default
ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default
ERI_D_Ld9.ne30_g17.I2000Clm50BgcCru.derecho_intel.clm-vrtlay
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-drydepnomegan
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default
ERI_Ld9.f45_g37.I2000Clm50BgcCru.derecho_intel.clm-nofire

ERI_D_Ld9_P48x1.T31_g37.I2000Clm50Sp.izumi_nag.clm-reduceOutput
ERI_D_Ld9_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-reduceOutput
ERI_D_Ld9_P48x1.f10_f10_mg37.I2000Clm50Sp.izumi_nag.clm-SNICARFRC

They fail with the following kind of message...

 ---------------------------------------------------
2024-02-21 22:51:39: Exception during run:
[Errno 2] No such file or directory: '/glade/derecho/scratch/erik/tests_ctsm51d166erikb4bacl/archive/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref1/rest/2000-01-02-00000/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref1.clm2.rh0.2000-01-02-00000.nc' -> '/glade/derecho/scratch/erik/tests_ctsm51d166erikb4bacl/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref2/run/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref1.clm2.rh0.2000-01-02-00000.nc'
Traceback (most recent call last):
  File "/glade/work/erik/ctsm_worktrees/external_updates/cime/CIME/SystemTests/system_tests_common.py", line 282, in run
    self.run_phase()
  File "/glade/work/erik/ctsm_worktrees/external_updates/cime/CIME/SystemTests/eri.py", line 186, in run_phase
    _helper(dout_sr1, refdate_2, refsec_2, rundir2)
  File "/glade/work/erik/ctsm_worktrees/external_updates/cime/CIME/SystemTests/eri.py", line 34, in _helper
    os.symlink(item, dst)
FileNotFoundError: [Errno 2] No such file or directory: '/glade/derecho/scratch/erik/tests_ctsm51d166erikb4bacl/archive/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref1/rest/2000-01-02-00000/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref1.clm2.rh0.2000-01-02-00000.nc' -> '/glade/derecho/scratch/erik/tests_ctsm51d166erikb4bacl/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref2/run/ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default.GC.ctsm51d166erikb4bacl_int.ref1.clm2.rh0.2000-01-02-00000.nc'

 ----------

This simple fix gets it to work.

diff --git a/CIME/SystemTests/eri.py b/CIME/SystemTests/eri.py
index 272a3881a..18c69e187 100644
--- a/CIME/SystemTests/eri.py
+++ b/CIME/SystemTests/eri.py
@@ -181,6 +181,8 @@ def run_phase(self):
             clone2.set_value("HIST_N", hist_n)

         rundir2 = clone2.get_value("RUNDIR")
+        if not os.path.exists(rundir2):
+            os.makedirs(rundir2)
         dout_sr2 = clone2.get_value("DOUT_S_ROOT")

         _helper(dout_sr1, refdate_2, refsec_2, rundir2)

This looks like the right thing to do, because similar logic is already applied to rundir further down in the same function.

I'm guessing that something changed so that rundir2 no longer exists at this point, although it did in the past. In any case, the change above gets the tests to work.
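The error message names both the source and the destination of the symlink, which can make it look as if the archived restart file is missing. A minimal sketch of the failure mode, assuming a POSIX filesystem and using made-up directory and file names (`try_symlink` is a hypothetical helper, not part of CIME): the source file exists, but `os.symlink` still raises `FileNotFoundError` because the destination's parent directory does not.

```python
import os
import tempfile


def try_symlink(src, dst):
    """Attempt os.symlink(src, dst); return None on success, else the OSError."""
    try:
        os.symlink(src, dst)
        return None
    except OSError as exc:
        return exc


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the archived restart file; it exists.
    src = os.path.join(tmp, "archive", "rest", "case.ref1.clm2.rh0.nc")
    os.makedirs(os.path.dirname(src))
    open(src, "w").close()

    # Stand-in for the second case's run directory; it has NOT been created.
    rundir2 = os.path.join(tmp, "case.ref2", "run")
    dst = os.path.join(rundir2, os.path.basename(src))

    err = try_symlink(src, dst)
    print(type(err).__name__, "before creating the run directory")

    # Same remedy as the diff above: create the directory, then link.
    if not os.path.exists(rundir2):
        os.makedirs(rundir2)
    err = try_symlink(src, dst)
    print("error after creating the run directory:", err)
```

If the first call reports `FileNotFoundError` and the second succeeds, the missing piece is the destination directory rather than the restart file.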

ekluzek commented 9 months ago

@fischer-ncar and @jedwards4b should I make the PR for this to the cime6.0.217_httpsbranch branch or to master?

ekluzek commented 9 months ago

ERI tests were working in cesm2_3_alpha17a using cime6.0.175, so the breakage must come from a change after that point.

jedwards4b commented 9 months ago

This is already fixed in tag 02 of the same branch.

ekluzek commented 9 months ago

Ahhh, OK, the directory creation is added inside the helper function rather than before each call to it, which is better. You could also remove the equivalent check for rundir a little later, since the helper now covers it.
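For illustration, here is a sketch of that approach: the helper itself guarantees the destination directory exists, so no caller needs its own check. This is not the actual code from `eri.py`; the function name `link_restart_files`, its parameters, and the glob pattern are assumptions made for the example.

```python
import glob
import os


def link_restart_files(rest_dir, rundir, pattern="*"):
    """Symlink files matching pattern in rest_dir into rundir.

    Hypothetical helper: it creates rundir on demand, so callers do not
    need their own os.path.exists / os.makedirs check.
    """
    os.makedirs(rundir, exist_ok=True)  # no-op if the directory already exists
    linked = []
    for item in sorted(glob.glob(os.path.join(rest_dir, pattern))):
        dst = os.path.join(rundir, os.path.basename(item))
        # Remove a stale file or (possibly dangling) link before re-linking.
        if os.path.lexists(dst):
            os.remove(dst)
        os.symlink(item, dst)
        linked.append(dst)
    return linked
```

With the check inside the helper, each call site reduces to a single call such as `link_restart_files(rest_dir, rundir2)`, and calling it twice for the same directory is harmless.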

I also didn't see that tag in the testdb and had to fetch all the tags to see it in my clone.

In any case, I'm closing this as fixed in cime6.0.217_httpsbranch02.