E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.
https://docs.e3sm.org/E3SM

Use adjust_ps=false in Cxx theta-l_kokkos code #6063

Closed tcclevenger closed 2 months ago

tcclevenger commented 6 months ago

This is an incremental change to introduce adjustment of dp3d instead of adjustment of ps.

adjust_ps is still used for preqx (forcing was never rewritten), rsplit=0 (only adjust_ps is possible), or EAM (to not touch every test).

Fixed bugs in forcing_ut init, testing it only with adjust_ps=false option.

[non-BFB] for HOMME tests with baselines (with moist forcing and some without nontrivial forcing, because even zero forcing is processed differently for adjustment of dp3d).

rljacob commented 6 months ago

Maybe we want adjust_ps to be false in v3? @golaz or @wlin7 ?

rljacob commented 6 months ago

This adjust_ps logical has come up in other PRs (https://github.com/E3SM-Project/E3SM/pull/4717) and the comment is always "we're not going to change it for EAM because we don't want to change v2 answers". We're assembling v3 now so time to reconsider that?

Also this issue https://github.com/E3SM-Project/scream/issues/1343 says "E3SM should also migrate to adjust_ps=.false., but this is on hold in order to preserve buggy V2 behavior."

mt5555 commented 6 months ago

To Rob's question: Yes, EAM also has this bug and EAM should adopt adjust_ps=.false. But I didn't want to suggest it since it's a small effect and we are so close to the V3 deadlines.

ambrad commented 6 months ago

Will this be answer changing for SCREAM simulations?

tcclevenger commented 6 months ago

> Will this be answer changing for SCREAM simulations?

When we merge for EAMxx it will be answer changing for C++, not otherwise (#ifdef SCREAM here still has same behavior).

ambrad commented 6 months ago

Ok, and everyone has agreed to that? I ask because it looks like the old method is being removed from C++, so we won't be able to switch back if there are issues.

oksanaguba commented 6 months ago

I was about to ask whether we want climo for this change, to document it, even if we can do climo only in eam. I ran climo for this a long time ago, so it would be best to rerun it: https://acme-climate.atlassian.net/wiki/spaces/COM/pages/1632864081/Climo+6-year+runs+for+ps+adjustment+dp3d+adjustment+T+adjustment+choices

mt5555 commented 6 months ago

Correct - (removing old code). This has been tested in SCREAM v0.1, but the bug was then accidentally turned back on in v1, so it hasn't had extensive testing in SCREAM v1. It is low risk, though.

oksanaguba commented 6 months ago

Also, can we change the description a little? Something like

This PR addresses the issue that the moist pressure forcing was applied to ps instead of dp3d (design inherited from CAM).
The change affects all runs in standalone homme with forcing, but adjusting ps is hardcoded for EAM runs with F executable for now.
When this is pulled into scream, it will use dp3d adjustment in EAMXX, not ps adjustment.

nonbfb for homme runs with forcings against baselines, should be bfb for cxx-vs-f tests.

@tcclevenger did you run homme suites for this? if so, would you please post the output from chrysalis and weaver here? thanks!

oksanaguba commented 6 months ago

Though my comment above is suggestive, i actually think we should rerun climo for this change. If @tcclevenger does not want to, i can do it.

tcclevenger commented 6 months ago

@oksanaguba I have run all theta-f* tests on my workstation. I am struggling with weaver at the moment. Have you successfully run there since the drivers and modules were updated? I will continue to try, or move to summit.

And I do not have access to chrysalis.

oksanaguba commented 6 months ago

@tcclevenger i haven't used weaver for a while.

rljacob commented 5 months ago

Waiting on @tcclevenger to test on GPU.

tcclevenger commented 5 months ago

An update: I now have homme building on weaver, but the F90 theta-f* tests I'm trying to run appear to be hanging, so I'm not able to produce baselines or run the cxx_vs_f90 tests. I'm working on figuring this out.

As for summit, I'm running into an internal compiler error whose solution doesn't seem trivial. I'll most likely continue on the weaver path.

ambrad commented 5 months ago

Is the Summit ICE for the standalone-Homme build? Does it occur with this configuration?

machinefile=$e3sm/components/homme/cmake/machineFiles/summit-bfb.cmake
cmake -C $machinefile $e3sm/components/homme

This configuration worked for me with the master branch of a few weeks ago; you might check the master branch in addition to your branch.

You should also merge this PR into the SCREAM repo and run the CIME SCREAMv1 test suite, e.g., the specific test ERS_Ln90.ne30pg2_ne30pg2.F2010-SCREAMv1.

Another point: The ICE might depend on modules. You can source the .env_mach_specific.sh file from a CIME test to get an environment to use when building standalone Homme.

tcclevenger commented 5 months ago

@oksanaguba @ambrad Looks like my issue on summit was the modules! Loading them through CIME has it building without error. Thanks for the help!

tcclevenger commented 5 months ago

Ran the theta-f* tests on summit, all cxx_vs_f90 tests pass, the following tests fail with diffs in 5 of 21 fields (kokkos and f90 are identical).

I also merged into EAMxx and ran AT test suite on weaver and mappy. Weaver passed, Mappy failed with diffs in CIME cases (expected) with normalized diffs in range [2e-5, 1]. I also ran same CIME cases on summit and diffs existed in the same fields as CPU and the same values.

@mt5555 Do these numbers seem reasonable (if that is possible to know), or are there any simulations I need to run to test the output?

@oksanaguba can you run the Chrysalis tests? I don't have access yet.

oksanaguba commented 5 months ago

Ran climo for model-vs-model for EAM with the default setup (adjust_ps = true, master branch) and with adjust_ps = false. The climo is here https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/onguba/theta/eam-ne30-dpadj.443759.0002-0006/def/viewer/ , not sure if these diffs are considered small, but they are documented.

tcclevenger commented 5 months ago

@oksanaguba Do you know if there are cxx vs. F90 tests using adjust_ps=false in nightly tests?

oksanaguba commented 5 months ago

@tcclevenger looking at the code, homme standalone tests with forcing were tested only with adjust_ps = true in F and xx. Related to this and to Mark's comment, when you test this PR, only baseline comparisons should fail, but cxx-vs-F should not fail (both now will be using adjust_ps = false). There is a comment in the PR description that "unit tests are now nonbfb" -- not sure what you mean. Unit tests run smaller chunks of code and test cxx kernels against F code, often only in a setup that does not cover all possibilities. If you see nonbfb in the forcing unit test, that means that 1) there is a bug in cxx code, or 2) params were not set up exactly. Either way it has to be addressed. The same goes for cxx-vs-f tests: they should always pass. The only fails here should be from comparisons with baselines.

tcclevenger commented 5 months ago

@oksanaguba Yes, I tested all the theta-f*ne2 tests. The fails were in

theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-ne2-nu3.4e18-ndays1
theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-kokkos-ne2-nu3.4e18-ndays1
theta-fhs1-ne2-nu3.4e18-ndays1
theta-fhs1-kokkos-ne2-nu3.4e18-ndays1
theta-fhs2-ne2-nu3.4e18-ndays1
theta-fhs2-kokkos-ne2-nu3.4e18-ndays1
theta-fhs3-ne2-nu3.4e18-ndays1
theta-fhs3-kokkos-ne2-nu3.4e18-ndays1

which were all baseline tests. All the cxx_vs_f90 tests pass. I'm confident the F90 matches the C++, but neither match their baselines on master (which is expected).

mt5555 commented 5 months ago

adding some notes from conversation with @oksanaguba

baseline DIFFs expected for all standalone HOMME tests that include forcing and have rsplit>0. Thus all the diffs in the theta-fhs* tests above are expected.

For HOMME tests without forcing (such as the one below) we would expect no differences. However, with this PR the forcing code does some calculations with a different code path, applying a forcing of strength 0. Thus it is possible that it introduces roundoff changes due to the different code path. If the baselines agree on Chrysalis, I think it is safe to conclude this baseline diff on summit is roundoff.

theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-ne2-nu3.4e18-ndays1
theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-kokkos-ne2-nu3.4e18-ndays1

mt5555 commented 5 months ago

revised PR looks great.

@oksanaguba will run the test suite on Chrysalis and then I think this is ready to go!

oksanaguba commented 5 months ago

An update: There are issues with testing on chrysalis, i am trying to fix forcing ut for now.

oksanaguba commented 5 months ago

This PR fails on chrysalis in forcing_ut and in some sl tests. For the forcing test, the issue is not that F and cxx are not bfb (i believe they are), but that answers become nans. Nans appear because hydrostatic pressure is computed in applycam_forcing twice, but in different ways. To compute pprime, first phydro is computed via

  do k=1,nlev
      phydro(:,:,k)=hvcoord%ps0*hvcoord%hyam(k) + ps(:,:)*hvcoord%hybm(k)
   enddo

which uses hybrid coefs. Then phydro is computed via

         ! recompute hydrostatic pressure from dp3d
         call get_hydro_pressure(phydro,elem%state%dp3d(:,:,:,np1),hvcoord)

to add pprime back. The issue is that in unit tests, dp3d, ps, and hybrid coefs are not in sync. In unit tests, pprime is way off if phydro is computed via A, B (because in unit tests random init does not use A and B). When pprime is added back after moisture adjustment, and pprime is sometimes negative, it can lead to negative pressure (because phydro via get_hydro_pressure is very different from the one from A, B).
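To make the mismatch concrete, here is a minimal Python sketch with toy numbers. All names and values are hypothetical stand-ins (they are not taken from HOMME), and `phydro_from_dp` only assumes that a routine like get_hydro_pressure accumulates dp downward from the model top. It compares the two phydro formulas for a dp that is in sync with ps and the hybrid coefs, and for a dp that was randomized independently.

```python
import random

def interface_coefs(nlev, a_top=0.002):
    # Toy interface coefficients (hypothetical): A falls from a_top to 0, B rises from 0 to 1.
    hyai = [a_top * (1.0 - k / nlev) for k in range(nlev + 1)]
    hybi = [k / nlev for k in range(nlev + 1)]
    return hyai, hybi

def midpoints(xi):
    return [0.5 * (xi[k] + xi[k + 1]) for k in range(len(xi) - 1)]

def phydro_from_ab(ps, ps0, hyai, hybi):
    # Midpoint pressure from hybrid coefficients and surface pressure.
    return [ps0 * a + ps * b for a, b in zip(midpoints(hyai), midpoints(hybi))]

def phydro_from_dp(dp, ptop):
    # Midpoint pressure by accumulating layer thicknesses from the model top.
    p, p_int = [], ptop
    for d in dp:
        p.append(p_int + 0.5 * d)
        p_int += d
    return p

def dp_from_ab(ps, ps0, hyai, hybi):
    # Layer thickness consistent with ps and the interface coefficients.
    return [(hyai[k + 1] - hyai[k]) * ps0 + (hybi[k + 1] - hybi[k]) * ps
            for k in range(len(hyai) - 1)]

nlev, ps0 = 10, 1000.0
hyai, hybi = interface_coefs(nlev)
ptop = hyai[0] * ps0
rng = random.Random(0)
ps = rng.uniform(800.0, 1200.0)

p_ab = phydro_from_ab(ps, ps0, hyai, hybi)
dp_synced = dp_from_ab(ps, ps0, hyai, hybi)
dp_random = [rng.uniform(1.0, 20.0) for _ in range(nlev)]  # not tied to ps, A, B

def max_gap(dp):
    return max(abs(x - y) for x, y in zip(p_ab, phydro_from_dp(dp, ptop)))

# pprime is formed against the A,B pressure but added back to the dp-based one.
# Assumption for illustration: pnh starts out hydrostatic with respect to dp.
p_dp = phydro_from_dp(dp_random, ptop)
pprime = [pd - pa for pd, pa in zip(p_dp, p_ab)]
pnh_new = [pd + pp for pd, pp in zip(p_dp, pprime)]

print("max gap, synced dp:", max_gap(dp_synced))
print("max gap, random dp:", max_gap(dp_random))
print("min pnh after adding pprime back:", min(pnh_new))
```

With the synced dp the two formulas agree to roundoff; with the independently randomized dp they are far apart, and adding pprime back can push the pressure below zero in this toy setup.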

1) One obvious solution is to sync random init in unit tests -- that is, use random ps value and compute dp from A and B (right now, dp is randomized instead).

2) Strategically I think a better solution is to rewrite

phydro(:,:,k)=hvcoord%ps0*hvcoord%hyam(k) + ps(:,:)*hvcoord%hybm(k)

to use dp, not hybrid. This is not a nonbfb change, but we should move away from using hybrid there. Note that overall it is not a bug (yet), because the code in EAM/scream is called when dp is in sync with hybrid coefficients.

I can see that solution #1 will be picked up, so I would ask @tcclevenger to implement it.

I will look at the other fails next.

tcclevenger commented 5 months ago

Thanks @oksanaguba! I'll work on implementing solution 1

oksanaguba commented 5 months ago

To add why this was not triggered before -- it is because both codes, F and cxx were using only adjust_ps=true option. After this PR forcing_ut will only check adjust_ps=false option.

oksanaguba commented 5 months ago

@tcclevenger i forgot to mention one detail: we need this fix in the forcing_ut.cpp file, too (this is because in F the remap factor is -1 by default and in cxx it is actually 0):

  // Init everything through singleton, which is what happens in normal runs
  auto& c = Context::singleton();
  auto& p = c.create<SimulationParams>();

  p.dt_remap_factor = 1;  // <-- this line

Can you push it into your branch? thanks.

oksanaguba commented 5 months ago

Failing tests are theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-ne?-nu3.4e18-ndays1, and the fails are against baselines, not cxx-vs-F. These tests have zero forcing and they are NH. At first i was alarmed that only SL tests fail, but it turns out that tests theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-ne?-nu3.4e18-ndays1 (not SL), and other no-forcing, NH or HY, not-SL tests, do not set use_moisture=true. It also turns out that use_moisture=true leads to differences between runs with adjust_ps=true and adjust_ps=false in test theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-ne2-nu3.4e18-ndays1 (not SL). use_moisture in the code affects how homme forcings are computed, because they all depend on dp, and if use_moisture=false, then dp/pressures are not recomputed.

Both options, adjust_ps=true and adjust_ps=false, should lead to near zero forcing in homme, but because of how homme uses forcing (converting T tend into homme state tendencies), homme tendencies from zero FQ and FT won't necessarily be identically zero. One issue is that there are nonlinear relationships between FT, FQ and FVtheta and FPhi. I actually could not trace the differences between adjust_ps=true and adjust_ps=false at the very first step (gave up before trying to figure out whether fortran prints all significant digits), but the differences show up at the 2nd call to applycamforcing_tracers.

i am more confident now that the PR behaves as expected, but we should switch to the "wet" option in non-SL tests as well,

diff --git a/components/homme/test/reg_test/namelists/theta.nl b/components/homme/test/reg_test/namelists/theta.nl
index c67923edeb..90876d48e2 100644
--- a/components/homme/test/reg_test/namelists/theta.nl
+++ b/components/homme/test/reg_test/namelists/theta.nl
@@ -35,6 +35,7 @@ hypervis_subcycle_tom  = ${HOMME_TEST_HVS_TOM}
 theta_hydrostatic_mode = ${HOMME_THETA_HY_MODE}
 theta_advect_form = ${HOMME_THETA_FORM}
 tstep_type        = ${HOMME_TTYPE}
+moisture = 'wet'
 /
 &solver_nl
 precon_method = "identity"

tcclevenger commented 4 months ago

@oksanaguba What is an appropriate range for randomizing ps values?

I'm back to working on this today, so hopefully we will wrap it up soon. Thanks for all your help!

oksanaguba commented 4 months ago

@tcclevenger you could take ps from, say, 800 to 1200 -- i believe you mean randomized state routine.

tcclevenger commented 4 months ago

@oksanaguba I added a new state.randomize() function which randomizes ps and computes dp, and I have the forcing_ut.cpp tests using that new randomize function. Also set dt_remap_factor=1 for those forcing tests.

Did you want me to switch to the "wet" option in non-SL tests?

oksanaguba commented 4 months ago

Conrad, we just talked with Mark about the easiest way forward, and he suggested wrapping this line in the F code

phydro(:,:,k)=hvcoord%ps0*hvcoord%hyam(k) + ps(:,:)*hvcoord%hybm(k)    ! Line 1

with an ifdef (say, using HOMMEBFB ifdef), to switch between the original line and calling get_hydro_pressure() there. In the CXX code we would replace Line 1 with a get_hydro_pressure() call without ifdefs (confirming that get_hydro_pressure() and Line 1 produce the same result is a separate small task). However, it seems you have already fixed initializing dp, so we should probably stick to your changes. I will test your change tonight.

Separately, I will give it another try to narrow down where the code with adjust_ps=false diverges from baselines, so, do not change namelist just yet.

Could you instead follow Andrew's suggestion and run the ERS test in SCREAM with the adjust_ps change?

oksanaguba commented 4 months ago

I am seeing memory access errors while trying to run the new version of forcing_ut:

...
 -> hydrostatic mode: false
   -> moisture: dry
     -> adjustment: true
     -> adjustment: false
   -> moisture: moist
     -> adjustment: true
double free or corruption (!prev)

Would you be able to locate the issue?

tcclevenger commented 4 months ago

@oksanaguba Found the issue! Turns out a local buffer view m_pi_i was only being allocated if m_hydrostatic=true, but when m_adjust_ps=false we need this buffer view if m_hydrostatic=false as well. I think it was machine specific for the forcing ut, so on weaver we got lucky. And for the thetaf tests, I'm guessing other classes had larger memory buffers so we were lucky that the request overall was large enough.

I'll work on running ERS test on GPU tomorrow.

tcclevenger commented 4 months ago

@oksanaguba Ran ERS test in SCREAM with these changes in homme and we pass.

tcclevenger commented 4 months ago

@oksanaguba What still needs to be looked at for this? Is there anything I can do on my end?

oksanaguba commented 4 months ago

@tcclevenger i want to check one more time which code line(s) actually make adjust_ps vs adjust_dp cases diverge when forcing is zero. that is something i tried to do before, but did not succeed. i plan to get back to this at the end of this week.

oksanaguba commented 3 months ago

Regarding nonbfb behavior for adjust_ps=true vs false with use_moisture=true, it is due to this code

#ifdef MODEL_THETA_L
   if (use_moisture) then
      ! compute updated pnh and exner
      if (adjust_ps) then
         ! recompute hydrostatic pressure from ps
         do k=1,nlev
            phydro(:,:,k)=hvcoord%ps0*hvcoord%hyam(k) + ps(:,:)*hvcoord%hybm(k)
         enddo
      else
         ! recompute hydrostatic pressure from dp3d
         call get_hydro_pressure(phydro,elem%state%dp3d(:,:,:,np1),hvcoord)
      endif

that is, phydro is computed differently. When using test theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-ne2-nu3.4e18-ndays1, these differences do not show up during the 1st time step (not sure why, not like the values are represented as decimals)

OG phydro in forcing  4  3  8  0.58593750000000000E+04
OG phydro in forcing  4  3  9  0.66406250000000000E+04
OG phydro in forcing  4  3 10  0.74218750000000000E+04

but show up in the 2nd time step

934459,934460c934459,934460
< OG phydro in forcing  1  1  8  0.58593940740563385E+04
< OG phydro in forcing  1  1  9  0.66406466172638511E+04
---
> OG phydro in forcing  1  1  8  0.58593940740563394E+04
> OG phydro in forcing  1  1  9  0.66406466172638502E+04

I do not see anything else that would be a concern.
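One way to quantify the two diffs above is to express them in units in the last place (ulp). The sketch below (Python 3.9+ for `math.ulp`) only parses the four logged values; the helper name is hypothetical, and it assumes the 17 printed digits are enough to recover each double.

```python
import math

def ulps_apart(a_text, b_text):
    # Parse the Fortran-style decimals and measure their distance in ulps of the first value.
    a, b = float(a_text), float(b_text)
    return abs(a - b) / math.ulp(a)

# Values copied from the 2nd-time-step diff: ("<" line, ">" line) for k=8 and k=9.
pairs = [
    ("0.58593940740563385E+04", "0.58593940740563394E+04"),
    ("0.66406466172638511E+04", "0.66406466172638502E+04"),
]
for a_text, b_text in pairs:
    print(a_text, b_text, ulps_apart(a_text, b_text))
```

A distance of only a few ulps would be consistent with the two phydro code paths differing by roundoff alone.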

oksanaguba commented 3 months ago

i am running homme suite on chrysalis, after that this will be ready.

oksanaguba commented 3 months ago

@tcclevenger forcing test passed for me when running a few times by hand, but failed in test suite, with seed 1206257716 . So i hardcoded the seed into forcing_ut.cpp and it does fail, with nans (not like results differ). last time we saw it it was about init of ps and init of dp3d mismatch (dp3d did not get init-ed from A,B). you fixed that, so i am not sure what else is there. one of us would need to debug.

oksanaguba commented 3 months ago

@tcclevenger i believe one issue with the new code is that this init of dp is not what we want:

       Kokkos::parallel_for(Kokkos::ThreadVectorRange(kv.team,NUM_LEV),
                            [&](const int ilev) {
        dp(ie,tl,igp,jgp,ilev) = ps0*hybrid_am(ilev)
                               + ps(ie,tl,igp,jgp)*hybrid_bm(ilev);

See, for example, this correct code for dp from A,B:

  do k = 1 , nlev
    dp_ref(k) = ( hvcoord%hyai(k+1) - hvcoord%hyai(k) ) * hvcoord%ps0 + &
         ( hvcoord%hybi(k+1) - hvcoord%hybi(k) ) * ps  !Reference pressure difference

I tried this fix:

+++ b/components/homme/src/theta-l_kokkos/cxx/ElementsState.cpp
@@ -285,8 +285,15 @@ void ElementsState::randomize(const int seed,
   auto dp = m_dp3d;
   auto ps = m_ps_v;
   auto ps0 = hvcoord.ps0;
-  auto hybrid_am = hvcoord.hybrid_am; 
-  auto hybrid_bm = hvcoord.hybrid_bm; 
+  //auto hybrid_am = hvcoord.hybrid_am; 
+  //auto hybrid_bm = hvcoord.hybrid_bm; 
+
+  // Create local copies, to avoid issue of lambda on GPU
+  auto hyai = hvcoord.hybrid_ai_packed;
+  auto hybi = hvcoord.hybrid_bi_packed;
+  auto dhyai = hvcoord.hybrid_ai_delta;
+  auto dhybi = hvcoord.hybrid_bi_delta;
+
   const auto tu = m_tu;
   Kokkos::parallel_for(m_policy, KOKKOS_LAMBDA(const TeamMember& team) {
     KernelVariables kv(team, tu);
@@ -297,11 +304,21 @@ void ElementsState::randomize(const int seed,
                          [&](const int idx) {
       const int igp  = idx / NP;
       const int jgp  = idx % NP;
+
+      ColumnOps::compute_midpoint_delta(kv,hyai,dhyai);
+      ColumnOps::compute_midpoint_delta(kv,hybi,dhybi);
+
       Kokkos::parallel_for(Kokkos::ThreadVectorRange(kv.team,NUM_LEV),
                            [&](const int ilev) {
-        dp(ie,tl,igp,jgp,ilev) = ps0*hybrid_am(ilev)
-                               + ps(ie,tl,igp,jgp)*hybrid_bm(ilev);
+
+        dp(ie,tl,igp,jgp,ilev) = ps0*dhyai(ilev)
+                               + ps(ie,tl,igp,jgp)*dhybi(ilev);

but it also did not work (meaning it seems that there is still a mismatch between this dp and dp as computed from p(A,B,ps)). So I printed averages of Ai vs Am and for me they do not match (but they should). We can re-iterate on Tuesday in case i made mistakes.
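As a plain-Python illustration of the point about dp from A,B (toy coefficients and hypothetical function names, not HOMME code): layer thickness comes from differences of the interface coefficients, so the column sums back to ps, whereas the midpoint expression gives a pressure rather than a thickness.

```python
def dp_from_interfaces(ps, ps0, hyai, hybi):
    # dp(k) = (A_i(k+1) - A_i(k)) * ps0 + (B_i(k+1) - B_i(k)) * ps
    return [(hyai[k + 1] - hyai[k]) * ps0 + (hybi[k + 1] - hybi[k]) * ps
            for k in range(len(hyai) - 1)]

def midpoint_pressure(ps, ps0, hyai, hybi):
    # ps0 * A_m(k) + ps * B_m(k): a mid-level pressure, not a thickness.
    return [0.5 * (hyai[k] + hyai[k + 1]) * ps0 + 0.5 * (hybi[k] + hybi[k + 1]) * ps
            for k in range(len(hyai) - 1)]

# Toy setup (hypothetical values): A falls to 0 at the surface, B rises to 1.
nlev, ps0, ps = 8, 1000.0, 950.0
hyai = [0.002 * (1.0 - k / nlev) for k in range(nlev + 1)]
hybi = [k / nlev for k in range(nlev + 1)]
ptop = hyai[0] * ps0

dp = dp_from_interfaces(ps, ps0, hyai, hybi)
not_dp = midpoint_pressure(ps, ps0, hyai, hybi)

print("ps                 =", ps)
print("ptop + sum(dp)     =", ptop + sum(dp))
print("ptop + sum(not_dp) =", ptop + sum(not_dp))
```

Checking that ptop plus the column sum of dp recovers ps is a cheap consistency test for any dp initialization of this kind.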

tcclevenger commented 3 months ago

Thanks for looking at this, @oksanaguba. I can try to do some debugging this week!

oksanaguba commented 3 months ago

This resolved it for me, that is, now hydro pressure computed from dp coincides with the pressure from A,B:

diff --git a/components/homme/src/share/cxx/HybridVCoord.cpp b/components/homme/src/share/cxx/HybridVCoord.cpp
index 569eef1d68..199b6bc4f7 100644
--- a/components/homme/src/share/cxx/HybridVCoord.cpp
+++ b/components/homme/src/share/cxx/HybridVCoord.cpp
@@ -158,8 +158,8 @@ void HybridVCoord::random_init(int seed) {

     Errors::runtime_check(curr>prev,"Error! hybrid_a+hybrid_b is not increasing.\n", -1);

-    host_hybrid_am_real(i-1) = (host_hybrid_ai(i) + host_hybrid_ai(i))/2.0;
-    host_hybrid_bm_real(i-1) = (host_hybrid_bi(i) + host_hybrid_bi(i))/2.0;
+    host_hybrid_am_real(i-1) = (host_hybrid_ai(i) + host_hybrid_ai(i-1))/2.0;
+    host_hybrid_bm_real(i-1) = (host_hybrid_bi(i) + host_hybrid_bi(i-1))/2.0;
   }

Please commit the changes from above (hv init and dp calculations), run the standalone forcing test on gpu (i assume there is no need to run the ERS test, since the above does not touch the forcing functor), and then i will re-test multiple times on chrysalis. Thanks.
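The effect of the index fix can be seen in a small Python sketch (toy interface values and hypothetical function names, not the HOMME code): averaging an interface value with itself just returns that interface, while averaging neighboring interfaces gives a value strictly between them.

```python
def midpoints_before_fix(xi):
    # (x(i) + x(i)) / 2 collapses to the interface value x(i).
    return [(xi[i] + xi[i]) / 2.0 for i in range(1, len(xi))]

def midpoints_after_fix(xi):
    # (x(i) + x(i-1)) / 2 is the mean of the two bounding interfaces.
    return [(xi[i] + xi[i - 1]) / 2.0 for i in range(1, len(xi))]

hybi = [0.0, 0.1, 0.3, 0.6, 1.0]  # toy increasing interface values

before = midpoints_before_fix(hybi)
after = midpoints_after_fix(hybi)
print("before fix:", before)
print("after fix: ", after)
```

With the pre-fix expression the "midpoint" coefficients are really interface coefficients, so a pressure built from them cannot match one accumulated from dp.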

tcclevenger commented 3 months ago

@oksanaguba Implemented your changes. I tested with that specific seed before and after, and it went from failing to passing. I also tested forcing ut on a weaver V100 gpu: it builds and runs without errors, and the test passes.

rljacob commented 3 months ago

discussion: Oksana will test one more time.

oksanaguba commented 2 months ago

Running hommebfb on chrysalis, these tests fail, as expected:

     30 - thetanhwet-TC (Failed)
     39 - thetanh-moist-bubble (Failed)
     43 - thetah-sl-dcmip16_test1pg2 (Failed)
     47 - thetanh-moist-bubble-kokkos (Failed)
     53 - thetanh-moist-bubble-sl (Failed)
     54 - thetanh-moist-bubble-sl-kokkos (Failed)
     55 - thetanh-moist-bubble-sl-pg2 (Failed)
     56 - thetanh-moist-bubble-sl-pg2-kokkos (Failed)
     75 - theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-ne2-nu3.4e18-ndays1 (Failed)
     76 - theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-kokkos-ne2-nu3.4e18-ndays1 (Failed)
     77 - theta-fhs1-ne2-nu3.4e18-ndays1 (Failed)
     78 - theta-fhs1-kokkos-ne2-nu3.4e18-ndays1 (Failed)
     79 - theta-fhs2-ne2-nu3.4e18-ndays1 (Failed)
     80 - theta-fhs2-kokkos-ne2-nu3.4e18-ndays1 (Failed)
     81 - theta-fhs3-ne2-nu3.4e18-ndays1 (Failed)
     82 - theta-fhs3-kokkos-ne2-nu3.4e18-ndays1 (Failed)
    114 - theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-ne6-nu1.25e17-ndays1 (Failed)
    115 - theta-f1-tt10-hvs1-hvst0-r2-qz10-nutopoff-GB-sl-kokkos-ne6-nu1.25e17-ndays1 (Failed)
    116 - theta-fhs1-ne6-nu1.25e17-ndays1 (Failed)
    117 - theta-fhs1-kokkos-ne6-nu1.25e17-ndays1 (Failed)
    118 - theta-fhs2-ne6-nu1.25e17-ndays1 (Failed)
    119 - theta-fhs2-kokkos-ne6-nu1.25e17-ndays1 (Failed)
    120 - theta-fhs3-ne6-nu1.25e17-ndays1 (Failed)
    121 - theta-fhs3-kokkos-ne6-nu1.25e17-ndays1 (Failed)

oksanaguba commented 2 months ago

also forcing_ut passed 100 times:

[ac.onguba@chr-0115 thetal_kokkos_ut]$ for i in $(seq 1 100); do srun -n 1 ./forcing_ut | grep "All tests passed" ; done | wc -l
100

oksanaguba commented 2 months ago

This PR is ready.

oksanaguba commented 2 months ago

  ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 (Overall: PASS) details:
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 CREATE_NEWCASE
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 XML
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 SETUP
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 SHAREDLIB_BUILD time=179
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 NLCOMP
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 MODEL_BUILD time=1082
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 SUBMIT
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 RUN time=93
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 COMPARE_base_rest
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 BASELINE master:
    FAIL ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 MEMCOMP [Errno 2] No such file or directory: '/lcrc/group/e3sm/baselines/chrys/intel/master/ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2/cpl-mem.log'
    FAIL ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 TPUTCOMP [Errno 2] No such file or directory: '/lcrc/group/e3sm/baselines/chrys/intel/master/ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2/cpl-tput.log'
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 MEMLEAK
    PASS ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_ftype2 SHORT_TERM_ARCHIVER