bartgol closed this issue 3 years ago
OK, I went on mappy and manually checked the test logs for PR #974, which is marked as "passed". Inside ctest's log, I can see this:
"shoc_stand_alone_ut_np1_omp1" start time: Apr 05 09:16 MDT
Output:
----------------------------------------------------------
ExecSpace name: OpenMP
ExecSpace initialized: yes
avx -AVX2-AVX compiler GCC default FPE mask 0 (NONE)
#host threads 1
sizeof(Real) = 8
default pack size = 16
-49 PIO ERROR: could not find variable T_mid in file shoc_standalone_ne4_ne4.nc
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 0.
Clearly, the error is happening. However, I suspect we call MPI_Abort with the wrong error code (0!!). When that 0 is returned to the caller (ctest), it is interpreted as success, so the test is reported as passed even though it aborted.
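To illustrate the suspicion, here is a rough sketch in plain Python (not the actual scream code, and no MPI involved). It assumes a runner that judges a test only by its exit status, which is what ctest does by default. The names `run_fake_test` and `safe_abort_code` are made up for this example. The child process prints an error and then exits with the given code, and the "runner" reports a pass whenever that code is 0.

```python
import subprocess
import sys


def run_fake_test(exit_code: int) -> bool:
    """Run a child that prints an error, then exits with `exit_code`.

    Returns True if a runner that only looks at the exit status
    would mark the test as passed.
    """
    # Hypothetical stand-in for the failing test executable.
    child_src = (
        "import sys\n"
        "print('PIO ERROR: could not find variable T_mid', file=sys.stderr)\n"
        f"sys.exit({exit_code})\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_src],
        capture_output=True,
        text=True,
    )
    # The error text on stderr is ignored; only the status counts.
    return result.returncode == 0


def safe_abort_code(err: int) -> int:
    """Map an error code to one that is guaranteed to be nonzero."""
    return err if err != 0 else 1


if __name__ == "__main__":
    print("exit code 0 -> passed?", run_fake_test(0))
    print("guarded code -> passed?", run_fake_test(safe_abort_code(0)))
```

If this is what is happening, one possible fix would be a guard along the lines of `safe_abort_code` applied to whatever value we hand to MPI_Abort, so that an abort can never look like a clean exit.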
This is causing master failures. I'm not sure how it got past the AT.
See comment at bottom of #973 for details.