Closed by slevis-lmwg 1 year ago
izumi test-suite PASS except the pgi test doesn't build. I heard that pgi would be removed soon and maybe that has now happened?
cheyenne test-suite PASS
As expected, the two izumi tests from #42 continue to fail, but they point to an error in the first timestep here:
cesm.exe 0000000000A0E82D mml_mainmod_mp_ph 2519 mml_main.F90
So I will pursue this further on izumi (rather than on cheyenne where the new error shows up as an mpt error in timestep 1490 of case2).
The two izumi tests from #42 continue to fail, but the latest error says:
[cli_54]: aborting job:
Fatal error in MPI_Irecv: Invalid datatype, error stack:
MPI_Irecv(153): MPI_Irecv(buf=0x7fe305fc7e20, count=1, INVALID DATATYPE, src=49, tag=241, comm=0xc4000012, request=0x7ffe6b153c40) failed
MPI_Irecv(103): Invalid datatype
[mpiexec@i042.cgd.ucar.edu] HYDT_bscd_pbs_wait_for_completion (tools/bootstrap/external/pbs_wait.c:67): tm_poll(obit_event) failed with TM error 17002
[mpiexec@i042.cgd.ucar.edu] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
I tried one of the two as a nag test instead of an intel test, and it passed!
I will probably stop pursuing this error any further once I confirm that my mods haven't broken anything else.
@ekluzek I have confirmed on izumi that my mods haven't changed answers. I'm running the cheyenne test-suite right now.
Meanwhile, the tests that were failing now stop at one of the new assert statements. I would like to discuss with you how to pursue this further or whether to table it for now. May I send an invite for a quick chat?
@ekluzek recommended a code change that got one of the failing tests to PASS on cheyenne.
@slevisconsulting will rerun test-suites. @ekluzek is concerned that this will change answers.
Good news: Cheyenne test-suite: PASS (no diffs from baseline). There's only one failing test left on cheyenne, and it's listed in the expected failures and in #17.
Izumi test-suite PASS, except pgi has stopped working in the last couple of weeks; I assume this means that the pgi compiler has been removed.
@ekluzek if you are also comfortable with this, I can go ahead and merge/tag.
@slevisconsulting yes PGI has been removed. So go ahead and merge this. You might as well remove the PGI test that no longer works. PGI is gone at this point. Going forward we should use the nvhpc compiler which is the successor to PGI (PGI was bought out by NVIDIA).
Relates to issue #42 which identifies 2 cheyenne tests and their 2 izumi equivalents as failing. The issue discusses:
ERP_D_Ld60.f19_g17.I1850Slim50RsGs.cheyenne_intel.clm-realistic_fromCLM5_1850Monthly
ERS_D_Ld60.f19_g16.H_MML_2000_CAM5.cheyenne_gnu.clm-global_uniform_g16_SOM
I will run full test-suites to confirm that answers remain unchanged for all the tests. Then this PR will be ready for merging.