ufs-community / ufs-srweather-app

UFS Short-Range Weather Application

Verification tasks fail on Cheyenne #933

Closed: mkavulich closed this issue 11 months ago

mkavulich commented 11 months ago

Expected behavior

All WE2E tests found in comprehensive.cheyenne should complete on Cheyenne.

Current behavior

Many verification tests currently fail on Cheyenne because the run_vx.local module fails to load: Lmod reports a syntax error in run_vx.local.lua. An example log file showing the failure is copied below.

Machines affected

Cheyenne only. Derecho tests run successfully.

Steps To Reproduce

  1. Attempt to run any WE2E test from the verification directory.
  2. Observe the failure in the verification tasks.

Additional Information (optional)

I honestly have no idea what the issue is here. It may be related to Cheyenne still using HPC-stack rather than Spack-stack, but that doesn't explain why Derecho is working.

Output

From the failed task run_MET_Pb2nc_obs in the WE2E test grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16:

+ /glade/scratch/kavulich/FIRE/esmf_fork/update3/ufs-srweather-app/ush/load_modules_run_task.sh run_vx /glade/scratch/kavulich/FIRE/esmf_fork/update3/ufs-srweather-app/jobs/JREGIONAL_RUN_MET_PB2NC_OBS

Loading modules for task "run_vx" ...
Lmod has detected the following error: Syntax error in file:
/glade/scratch/kavulich/FIRE/esmf_fork/update3/ufs-srweather-app/modulefiles/tasks/cheyenne/run_vx.local.lua
 with command: setenv, one or more arguments are not strings.

While processing the following module(s):
    Module fullname  Module Filename
    ---------------  ---------------
    run_vx.local     /glade/scratch/kavulich/FIRE/esmf_fork/update3/ufs-srweather-app/modulefiles/tasks/cheyenne/run_vx.local.lua

/glade/scratch/kavulich/FIRE/esmf_fork/update3/ufs-srweather-app/ush/bash_utils/print_msg.sh: line 192: BASH_SOURCE[1]: unbound variable
FATAL ERROR:
ERROR:
  From script:  ""
  Full path to script:  ""
Loading .local module file (in directory specified by modules_dir) for the
specified task (task_name) failed:
  task_name = "run_vx"
  modulefile_local = "run_vx.local"
  modules_dir = "/glade/scratch/kavulich/FIRE/esmf_fork/update3/ufs-srweather-app/modulefiles/tasks/cheyenne"
Exiting with nonzero status.
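
One common way for Lmod to report "setenv, one or more arguments are not strings" is a modulefile that passes the result of `os.getenv()` to `setenv` when the variable is unset, so the argument is Lua `nil`. Whether that is what happened in run_vx.local.lua is not confirmed in this thread. Assuming it is, the sketch below lists the environment variables a modulefile reads and flags the unset ones. The function name `check_modulefile_env` is hypothetical and not part of the SRW App.

```shell
# Hypothetical diagnostic helper (not part of ufs-srweather-app).
# Lists every variable a Lua modulefile reads via os.getenv("NAME") and
# reports whether it is set in the current shell.
# Returns 1 if at least one referenced variable is unset, 0 otherwise.
check_modulefile_env() {
  local modulefile="$1"
  local missing=0
  local var

  # Extract NAME from each os.getenv("NAME") or os.getenv('NAME') call.
  for var in $(grep -oE "os\.getenv\([\"'][A-Za-z_][A-Za-z0-9_]*[\"']\)" "$modulefile" \
                 | sed -E "s/os\.getenv\([\"']([A-Za-z_][A-Za-z0-9_]*)[\"']\)/\1/" \
                 | sort -u); do
    # eval is acceptable here because $var is restricted to identifier
    # characters by the regular expression above.
    if eval "[ -z \"\${${var}+x}\" ]"; then
      echo "UNSET: ${var}"
      missing=1
    else
      echo "set:   ${var}"
    fi
  done

  return "$missing"
}
```

A possible use is to load the same stack modules the task loads, then run `check_modulefile_env` on modulefiles/tasks/cheyenne/run_vx.local.lua. Any line marked `UNSET` would be a candidate for the non-string argument. The check does not cover other ways a `nil` could reach `setenv`, such as an undefined Lua variable.
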

natalie-perlin commented 11 months ago

@mkavulich - could you please try the MET verification tasks on Cheyenne with the Intel compilers? The met/10.1.2 and metplus/4.1.3 modules have been rebuilt for Cheyenne Intel.

(Fixes for Cheyenne GNU are coming.)

mkavulich commented 11 months ago

@natalie-perlin Thanks for your work there. The verification tests are now succeeding for both Intel and GNU on Cheyenne!