MetOffice / opsinputs

JEDI library generating VarObs and Cx files

Transient failure in test_lfriclite_hofx_atms_opsinputs #170

Closed by DJDavies2 1 year ago

DJDavies2 commented 1 year ago

This occurred on the Cray (cray_gnu):

14: ATMS: save database to DataOut/atms_hofx_20160101_sub86300_sub100.nc4 (io pool size: 4)
14: Rank 0 [Fri May 5 12:00:05 2023] [c2-0c1s7n1] Fatal error in MPIDI_Cray_shared_mem_coll_tree_reduce: Other MPI error, error stack:
14: MPIDI_Cray_shared_mem_coll_tree_reduce(177): message sizes do not match across processes in the collective routine: Received 8 but expected 192
14:
14: ===================================================================================
14: = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
14: = PID 3002 RUNNING AT nid00477
14: = EXIT CODE: 134
14: = CLEANING UP REMAINING PROCESSES
14: = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
14: ===================================================================================
14: YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
14: This typically refers to a problem with your application.
14: Please see the FAQ page for debugging suggestions

The test usually passes when re-triggered.

DJDavies2 commented 1 year ago

Sorry, I got confused and opened it in the wrong repository.