E3SM-Project / scream

Fork of E3SM used to develop exascale global atmosphere model written in C++
https://e3sm-project.github.io/scream/

With homme_shoc_cld_p3_rrtmgp stand-alone test, error when using ne2/ne30 #1623

Closed ndkeen closed 2 years ago

ndkeen commented 2 years ago

I originally thought this was a problem only with larger ne, but I can reproduce it with ne2, so I must be doing something wrong. I'm running on PM in debug with 1 rank (on ne2), using master as of April 27th.

I'm running in this directory: bld/tests/coupled/dynamics_physics/homme_shoc_cld_p3_rrtmgp

The error message in standard output:

0: [EAMXX] initialize_atm_procs ... done!
0: Start time stepping loop...       [  0%]
0: Atmosphere step = 0
0:   model time = 2021-10-12 12:30:00
0: 
0: 
0: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0: homme_shoc_cld_p3_rrtmgp is a Catch v2.13.8 host application.
0: Run with -? for options
0: 
0: -------------------------------------------------------------------------------
0: scream_homme_physics
0: -------------------------------------------------------------------------------
0: /global/cfs/cdirs/e3sm/ndk/se02-apr27/components/scream/tests/coupled/dynamics_physics/homme_shoc_cld_p3_rrtmgp/homme_shoc_cld_p3_rrtmgp.cpp:32
0: ...............................................................................
0: 
0: /global/cfs/cdirs/e3sm/ndk/se02-apr27/components/scream/tests/coupled/dynamics_physics/homme_shoc_cld_p3_rrtmgp/homme_shoc_cld_p3_rrtmgp.cpp:44: FAILED:
0:   {Unknown expression after the reported line}
0: due to a fatal error condition:
0:   SIGABRT - Abort (abnormal termination) signal
0: 
0: ===============================================================================
0: test cases: 1 | 1 failed
0: assertions: 2 | 1 passed | 1 failed

And in standard error:

0: /global/cfs/cdirs/e3sm/ndk/se02-apr27/components/homme/src/share/cxx/PpmRemap.hpp:535: lambda []()->auto::operator()()->auto: block: [5,0,0], thread: [0,11,0] Assertion `fabs(m_pio(kv.ie, igp, jgp, NUM_PHYSICAL_LEV) - m_pin(kv.ie, igp, jgp, NUM_PHYSICAL_LEV)) < 1.0` failed.
0: /global/cfs/cdirs/e3sm/ndk/se02-apr27/components/homme/src/share/cxx/PpmRemap.hpp:535: lambda []()->auto::operator()()->auto: block: [5,0,0], thread: [0,8,0] Assertion `fabs(m_pio(kv.ie, igp, jgp, NUM_PHYSICAL_LEV) - m_pin(kv.ie, igp, jgp, NUM_PHYSICAL_LEV)) < 1.0` failed.

We've seen this assertion failure before (early on), but I think it was resolved by switching to updated IC files (possibly the old files didn't have the right number of vertical levels).
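For readers unfamiliar with the check, here is a minimal sketch of what the failing assertion appears to compare. It assumes, from the variable names alone, that `m_pin` and `m_pio` hold cumulative pressure built from the layer thicknesses of the source and target columns, so the last entry of each is the column total. The helper names `cumulative_pressure` and `columns_consistent` are hypothetical and are not part of Homme; only the `< 1.0` tolerance is taken from the assertion text above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Homme code): build cumulative interface values
// from layer thicknesses, starting from 0 at the top of the column.
std::vector<double> cumulative_pressure(const std::vector<double>& dp) {
  std::vector<double> p(dp.size() + 1, 0.0);
  for (std::size_t k = 0; k < dp.size(); ++k) {
    p[k + 1] = p[k] + dp[k];
  }
  return p;
}

// Hypothetical check mirroring the shape of the assertion in the log:
// the last cumulative value of the source column (pin) and of the
// target column (pio) must agree to within `tol`.
bool columns_consistent(const std::vector<double>& dp_src,
                        const std::vector<double>& dp_tgt,
                        double tol = 1.0) {
  const std::vector<double> pin = cumulative_pressure(dp_src);
  const std::vector<double> pio = cumulative_pressure(dp_tgt);
  return std::fabs(pio.back() - pin.back()) < tol;
}
```

Under that assumption, any input whose layer thicknesses do not sum to the same total as the target grid would trip the check, which would be consistent with the earlier suspicion about IC files with the wrong vertical levels.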

input.yaml:

%YAML 1.1
---
Debug:
  Atmosphere DAG Verbosity Level: 5

Time Stepping:
  Time Step: 1800
  Start Time: [12, 30, 00]      # Hours, Minutes, Seconds
  Start Date: [2021, 10, 12]    # Year, Month, Day
  Number of Steps: 4

Initial Conditions:
  Physics GLL:
    Filename: /global/cfs/cdirs/e3sm/inputdata/atm/scream/init/homme_shoc_cld_p3_rrtmgp_init_ne2np4.nc
    #Filename: /global/cfs/cdirs/e3sm/bhillma/scream/data/init/screami_ne30np4L72_20220503.nc
    #Filename: /global/cfs/cdirs/e3sm/bhillma/scream/data/init/screami_ne120np4L72_20220503.nc
    surf_latent_flux: 0.0
    surf_sens_flux: 0.0
    aero_g_sw: 0.0
    aero_ssa_sw: 0.0
    aero_tau_sw: 0.0
    aero_tau_lw: 0.0

Atmosphere Processes:
  Number of Entries: 5
  Schedule Type: Sequential
  Process 0:
    Process Name: Homme
    Enable Precondition Checks: false
    Vertical Coordinate Filename: /global/cfs/cdirs/e3sm/inputdata/atm/scream/init/homme_shoc_cld_p3_rrtmgp_init_ne2np4.nc
    #Vertical Coordinate Filename: /global/cfs/cdirs/e3sm/bhillma/scream/data/init/screami_ne30np4L72_20220503.nc
    #Vertical Coordinate Filename: /global/cfs/cdirs/e3sm/bhillma/scream/data/init/screami_ne120np4L72_20220503.nc
    Moisture: moist
  Process 1:
    Process Name: SHOC
    Grid: Physics GLL
  Process 2:
    Process Name: CldFraction
    Grid: Physics GLL
  Process 3:
    Process Name: P3
    Grid: Physics GLL
  Process 4:
    Process Name: RRTMGP
    Grid: Physics GLL
    active_gases: ["h2o", "co2", "o3", "n2o", "co" , "ch4", "o2", "n2"]

Grids Manager:
  Type: Dynamics Driven
  Reference Grid: Physics GLL
  Dynamics Driven:
    Dynamics Namelist File Name: namelist.nl

# The parameters for I/O control
Scorpio:
  Output YAML Files: ["homme_shoc_cld_p3_rrtmgp_output.yaml"]
...

ndkeen commented 2 years ago

On the CPU nodes of PM I don't see this error (I see a different one instead), so I'll close this.