In some implementations of MPI-IO, MPI_File_open internally calls srand() on the master rank, so we cannot assume that rand() returns the same value on every MPI process. This causes a deadlock in RHMC, where the check of the eigenvalue range is triggered by rand(): if the ranks draw different values, they can disagree on whether to run the check.
It seems romio321 is responsible for this behaviour:
https://github.com/open-mpi/ompi/blob/master/ompi/mca/io/romio321/romio/adio/common/shfp_fname.c#L32
A workaround is to pass
--mca io ompio
to mpiexec. Alternatively, we can modify Grid/qcd/action/pseudofermion/OneFlavourEvenOddRationalRatio.h etc. to use rand_r() instead of rand() (or to broadcast the result of rand() from the master rank).
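
To illustrate why rand_r() would help, here is a minimal sketch that runs in a single process without MPI. It assumes a POSIX system, since rand_r() is declared in <stdlib.h> there. The helper names draw_private and ranks_agree and the seed values are hypothetical and are not part of Grid. The call to srand() stands in for whatever the MPI-IO layer does on the master rank. Because rand_r() keeps its state in a caller-owned seed, that call does not change the values drawn.

```cpp
#include <stdlib.h>   // rand_r (POSIX), srand
#include <cassert>
#include <vector>

// Hypothetical helper (not a Grid function): draws `count` values using a
// caller-owned seed, so the global state behind rand()/srand() is never used.
std::vector<int> draw_private(unsigned int seed, int count) {
    assert(count >= 0);
    std::vector<int> out;
    out.reserve(count);
    for (int i = 0; i < count; ++i) {
        out.push_back(rand_r(&seed));  // advances only the local copy of seed
    }
    return out;
}

// Hypothetical check: simulates two "ranks" that start from the same seed.
// Between the two draws the global generator is reseeded, standing in for a
// library that calls srand() on one rank only.
bool ranks_agree(unsigned int shared_seed, int count) {
    std::vector<int> other_rank = draw_private(shared_seed, count);
    srand(987654u);  // arbitrary value; simulates the reseed on the master rank
    std::vector<int> master_rank = draw_private(shared_seed, count);
    return master_rank == other_rank;
}
```

In this sketch ranks_agree(1234u, 8) returns true, because both draws depend only on the shared seed. In real code each rank would need to initialise its private seed to the same value and advance it the same number of times, which is an assumption about how the call sites are structured.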