As found by @pcolarco, checkpoint writes occasionally fail with Intel MPI. Debugging by @bena-nasa found that the fix is an old favorite, I_MPI_ADJUST_GATHERV=3, previously added in #119.
This setting was removed for Milan nodes at NCCS because it no longer seemed to be needed, but it apparently still is at times.
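
For reference, a minimal sketch of how the setting can be applied in a Bourne-style shell before the MPI launcher is invoked. Where it lives in the actual run scripts, and which shell they use (a csh script would use `setenv I_MPI_ADJUST_GATHERV 3` instead), is an assumption here and not taken from this PR.

```shell
# Ask Intel MPI to use algorithm 3 for the Gatherv collective.
export I_MPI_ADJUST_GATHERV=3

# Confirm the variable is exported, so child processes such as the
# MPI launcher will inherit it.
env | grep '^I_MPI_ADJUST_GATHERV='
```

The variable must be set in the environment that starts the MPI job; setting it after launch has no effect on already-running ranks.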