CESM-Development / cime

Common Infrastructure for Modeling the Earth

PEA tests exit with non-zero status despite passing #157

Closed billsacks closed 8 years ago

billsacks commented 9 years ago

From cesm1_4_beta07 (but I believe I have been seeing this, or similar, behavior since Mariana's big refactor 6 months ago):

This test:

PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi

gives:

$ cat cesm.stderr.255988
WARNING getTiming2: 1 nprocs - running in mpiserial mode so no additional timing to be reported , exiting
ls: No match.
WARNING getTiming2: 1 nprocs - running in mpiserial mode so no additional timing to be reported , exiting
cp: missing destination file operand after `/glade/scratch/sacks/cesm_baselines/cism2_1_02_beta07/PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi/timing_summary'
Try `cp --help' for more information.
BASEGEN_CPLPROFFILE: Undefined variable.

The test itself passes, but presumably some post-test thing is failing.
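As a rough aid for this kind of triage, the sketch below filters a stderr dump down to the lines that are not plain warnings, which is where a failing post-test step would show up. It is only an illustration: `non_warning_lines` is a hypothetical helper, and treating every line that starts with WARNING as benign is an assumption, not something the test scripts guarantee.

```python
def non_warning_lines(stderr_text):
    """Return the stderr lines that are not plain WARNING messages.

    Assumption: lines starting with 'WARNING' are benign and can be skipped.
    """
    return [
        line.strip()
        for line in stderr_text.splitlines()
        if line.strip() and not line.lstrip().startswith("WARNING")
    ]


# Sample input abridged from the stderr file quoted in this comment.
sample = """\
WARNING getTiming2: 1 nprocs - running in mpiserial mode so no additional timing to be reported , exiting
ls: No match.
BASEGEN_CPLPROFFILE: Undefined variable.
"""

for line in non_warning_lines(sample):
    print(line)
```

On the sample, this leaves the `ls` and `BASEGEN_CPLPROFFILE` messages, both of which point at a step that runs after the test comparison itself.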

sholly commented 9 years ago

Looks like the _M confopt is broken in the current cime...


Setting up the following test:

testcase: PEA_P1_M
grid: f09_g16
compset: TGHIST
machine: yellowstone
compiler: pgi
Uncaught exception from user code:
M option found but no MPILIB provided at /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/create_newcase line 701.
at /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/create_newcase line 701
main::_set_confopts('_P1_M', 'ConfigCase=HASH(0x13aa7d0)') called at /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/create_newcase line 443

File specifying possible compsets: /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/components/cism/cime_config/config_compsets.xml
Primary component (specifies possible compsets, pelayouts and pio settings): cism
Compset: HIST_SATM_DLND%SCPL_SICE_SOCN_SROF_CISM1_SWAV
Found machine "yellowstone" in /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/cime_config/cesm/machines/config_machines.xml
confopts = _P1_M
confopts pecount set to 1
invocation of create_newcase failed: create_newcase command was /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/create_newcase -silent -case /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi.150917-151717 -res f09_g16 -mach yellowstone -compset TGHIST -testname PEA -confopts _P1_M -mach_dir /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/cime_config/cesm/machines -compiler pgi
Setting up tools for test case..
Use of uninitialized value $cimeroot in concatenation (.) or string at ./create_test line 1410.
Use of uninitialized value $cimeroot in concatenation (.) or string at ./create_test line 1415.
Use of uninitialized value $case in concatenation (.) or string at ./create_test line 1415.
cp: cannot stat `/scripts/Testing/Testcases/tests_build.csh': No such file or directory
cp -f /scripts/Testing/Testcases/tests_build.csh /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi.150917-151717/.test_build failed: 256

jedwards4b commented 9 years ago

The _M confopt expects an argument - it is the mpi-library specifier as in _Mmpi-serial.
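To make the expected form concrete, here is a minimal Python sketch of how a confopts string could be split into options. It is not the Perl code in create_newcase: `parse_confopts` and the dictionary keys are hypothetical names, and only the _P and _M options mentioned in this thread are handled. The error text simply mirrors the message in the traceback above.

```python
def parse_confopts(confopts):
    """Split a confopts string such as '_P1_Mmpi-serial' into a dict.

    Hypothetical helper for illustration; handles only _P<count> and _M<mpilib>.
    """
    opts = {}
    for token in confopts.split("_"):
        if not token:
            continue  # skip the empty piece before the leading underscore
        key, value = token[0], token[1:]
        if key == "P":
            opts["pecount"] = int(value)
        elif key == "M":
            if not value:
                # A bare _M carries no MPI library name, so it is rejected.
                raise ValueError("M option found but no MPILIB provided")
            opts["mpilib"] = value
        else:
            raise ValueError(f"unrecognized confopt: _{token}")
    return opts


print(parse_confopts("_P1_Mmpi-serial"))

try:
    parse_confopts("_P1_M")
except ValueError as err:
    print("rejected:", err)
```

Under these assumptions, `_P1_Mmpi-serial` parses cleanly, while the `_P1_M` form used by the PEA tests is rejected because nothing follows the M.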


Jim Edwards

CESM Software Engineer National Center for Atmospheric Research Boulder, CO

billsacks commented 9 years ago

I can't speak to whether it was correct or not... but it used to be that all (I believe) PEA tests had that form: PEA_P1_M. Honestly, I'm not sure that I ever understood why the _M was there, but it was.

So, @jedwards4b : Do you suggest simply getting rid of the _M, or is some other change needed now to get the PEA test to work?

jedwards4b commented 9 years ago

Yes, get rid of the _M. Let me know if you figure out what it was meant for.


billsacks commented 9 years ago

I have removed the _M from the PEA tests added by CISM; with that change, the tests pass. That change does not solve the original issue in this bug report, though.

sholly commented 9 years ago

I just tried to create an SMS.f09_g16.TGHIST.yellowstone_pgi test, and I am still seeing this error:

ERROR preview_namelists: /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/components/cism/cime_config/buildnml /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/SMS.f09_g16.TGHIST.yellowstone_pgi.150921-160001 failed: 512
at /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/SMS.f09_g16.TGHIST.yellowstone_pgi.150921-160001/preview_namelists line 171
ERROR: /glade/p/work/jshollen/devsandboxes/cesm1_5_alpha01.issuefixes/cime/scripts/SMS.f09_g16.TGHIST.yellowstone_pgi.150921-160001/preview_namelists failed: 512
./case_setup failed: 512

mvertens commented 9 years ago

This fails because, right now, a compset that includes cism always needs a glc grid. The test name needs to be changed to SMS.f09_g16_gl5.TGHIST.yellowstone_pgi (note the gl5 in the resolution).
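For illustration only, the following Python sketch splits a test name into its fields and checks whether the grid carries a glc part. `split_test_name` and `has_glc_grid` are hypothetical helpers, not CIME functions, and the rule that a glc grid shows up as a `gl<N>` component (as in gl5) is an assumption drawn from this one example.

```python
def split_test_name(name):
    """Split '<testcase>.<grid>.<compset>.<machine>_<compiler>' into fields."""
    testcase, grid, compset, mach_comp = name.split(".")
    # The compiler is taken to be the last underscore-separated piece.
    machine, compiler = mach_comp.rsplit("_", 1)
    return {
        "testcase": testcase,
        "grid": grid,
        "compset": compset,
        "machine": machine,
        "compiler": compiler,
    }


def has_glc_grid(grid):
    """Assumption: a glc grid appears as a 'gl<N>' component, e.g. 'gl5'."""
    return any(part.startswith("gl") for part in grid.split("_"))


for name in (
    "SMS.f09_g16.TGHIST.yellowstone_pgi",
    "SMS.f09_g16_gl5.TGHIST.yellowstone_pgi",
):
    fields = split_test_name(name)
    print(name, "has glc grid:", has_glc_grid(fields["grid"]))
```

This is a naming heuristic, not a check of the actual grid definitions, so it can only flag test names that obviously lack a glc component.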

mvertens commented 9 years ago

I do not think that this necessarily fixes Bill's problem - but it does address Jay's issue.

sholly commented 8 years ago

I was not able to reproduce the issue from cesm1_4_beta07, but the TGHIST test is now failing with unset inputdata, so I am assigning this back to you, Bill... ;)

File status unknown: UNSET/UNSET.cpl.hs2x.0001-01-01.nc

billsacks commented 8 years ago

I cannot reproduce @sholly's problem from the cesm1_5_alpha04 branch:

DONE PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi : (test finished, successful coupler log) 
--- Test Functionality: ---
PASS PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi.cism.h.nc : test compare cism.h (.base and .mpiserial files) 
PASS PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi.cpl.hi.nc : test compare cpl.hi (.base and .mpiserial files) 
PASS PEA_P1_M.f09_g16.TGHIST.yellowstone_pgi : test functionality summary (compare .base and .mpiserial files) 
--- Test time is 191 seconds ---