Please use the issue template! Without information on your system, we cannot answer.
It appears that your queue system and/or MPI runtime is not set up properly. Perhaps you are using a different MPI runtime from what you used to compile RELION.
I updated it using the issue template. Thanks.
What does `which mpirun` say? Does it show the same MPI implementation as the one you used for compilation?
It said: /home/lab/anaconda3/bin/mpirun
This is suspicious: it is from conda. Check your CMake log file (if you don't have it, recompile RELION from scratch and check what it says). You probably used a system-wide OpenMPI (e.g. /usr/bin/mpicc) for compilation but are running with the OpenMPI runtime from conda. This mismatch is the cause.
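A quick way to compare the two (a sketch; the CMakeCache.txt path assumes the build directory mentioned below, and --showme is specific to Open MPI's wrapper):

which mpirun && mpirun --version                    # MPI runtime currently on your PATH
which mpicc && mpicc --showme                       # compiler wrapper a build would pick up
grep -i mpi /home/lab/relion/build/CMakeCache.txt   # MPI paths recorded at configure time

If the first command points into anaconda3 while the CMake cache points to /usr, you have the mismatch described above.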
Thanks for suggesting it. I will check this.
I checked the CMakeOutput.log file in /home/lab/relion/build/CMakeFiles. As you mentioned, the build used a system-wide OpenMPI.
I would appreciate it if you could suggest anything.
Thanks.
Run RELION without activating any conda environment (including `base`).
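For example (a sketch; adjust to your shell setup):

conda deactivate        # repeat until no environment, including base, is active
which mpirun            # should now point to the system OpenMPI, e.g. /usr/bin/mpirun
relion &                # start the GUI from this clean shell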
No response for many months. Closing.
Hi All,
I had this issue during 3D auto-refine, with the number of MPI procs set to 3 and threads set to 2. Any suggestions would be appreciated.
Thanks. Park
Environment:
Dataset:
Job options:
which relion_refine_mpi
--o Refine3D/job085/run --auto_refine --split_random_halves --i Extract/job056/particles.star --ref Class3D/job049/run_it025_class001_256px.mrc --firstiter_cc --ini_high 60 --dont_combine_weights_via_disc --pool 3 --pad 2 --auto_ignore_angles --auto_resol_angles --ctf --particle_diameter 200 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --auto_local_healpix_order 4 --offset_range 5 --offset_step 2 --sym O --low_resol_join_halves 40 --norm --scale --j 2 --gpu "" --pipeline_control Refine3D/job085/
Error message:
================== ERROR: MlOptimiserMpi::initialiseWorkLoad: at least 3 MPI processes are required when splitting data into random halves
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them.
in: /home/hoanglab/relion/src/ml_optimiser_mpi.cpp, line 539
ERROR: MlOptimiserMpi::initialiseWorkLoad: at least 3 MPI processes are required when splitting data into random halves
=== Backtrace ===
/usr/local/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_l+0x7d) [0x55fb830b785d]
/usr/local/bin/relion_refine_mpi(+0x4b138) [0x55fb83046138]
/usr/local/bin/relion_refine_mpi(_ZN14MlOptimiserMpi10initialiseEv+0xac) [0x55fb830ea62c]
/usr/local/bin/relion_refine_mpi(main+0x71) [0x55fb830a2c91]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7fd7093a20b3]
/usr/local/bin/relion_refine_mpi(_start+0x2e) [0x55fb830a5fbe]
ERROR: MlOptimiserMpi::initialiseWorkLoad: at least 3 MPI processes are required when splitting data into random halves
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them.
in: /home/hoanglab/relion/src/ml_optimiser_mpi.cpp, line 539
ERROR: MlOptimiserMpi::initialiseWorkLoad: at least 3 MPI processes are required when splitting data into random halves
=== Backtrace ===
/usr/local/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_l+0x7d) [0x55fdfcbf285d]
/usr/local/bin/relion_refine_mpi(+0x4b138) [0x55fdfcb81138]
/usr/local/bin/relion_refine_mpi(_ZN14MlOptimiserMpi10initialiseEv+0xac) [0x55fdfcc2562c]
/usr/local/bin/relion_refine_mpi(main+0x71) [0x55fdfcbddc91]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f31df8c40b3]
/usr/local/bin/relion_refine_mpi(_start+0x2e) [0x55fdfcbe0fbe]
ERROR: MlOptimiserMpi::initialiseWorkLoad: at least 3 MPI processes are required when splitting data into random halves
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them.
=== RELION MPI setup ===
Leader (0) runs on host = Cryo1
Running CPU instructions in double precision.
Running CPU instructions in double precision.
Running CPU instructions in double precision.
RELION version: 4.0-beta-2-commit-ce2e93 exiting with an error ...
RELION version: 4.0-beta-2-commit-ce2e93 exiting with an error ...
RELION version: 4.0-beta-2-commit-ce2e93 exiting with an error ...
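For reference, one way to check whether the three requested ranks really end up in the same MPI job (a sketch; paths are assumptions, adapt to your system). If the mpirun on your PATH comes from a different MPI than the one relion_refine_mpi was linked against (e.g. conda vs. system-wide, as discussed earlier in this thread), each process can start as an independent single-rank job, which would explain this "at least 3 MPI processes" error being printed three times:

which mpirun && mpirun --version                 # runtime launcher on your PATH
ldd $(which relion_refine_mpi) | grep -i libmpi  # MPI library RELION was linked against

If these point to different installations, relaunch the job with the matching mpirun, or rebuild RELION against the MPI you intend to run with.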