Closed: bforsbe closed this issue 7 years ago
Original comment by Bjoern Forsberg (Bitbucket: bforsbe, GitHub: bforsbe):
Also, when using relion_refine_mpi you should specify the number of working "slave" ranks plus one extra rank to act as the "master", as always. Since I get the impression you want to use both GPUs, you should use
```
#!bash
mpirun -n 3 relion_refine_mpi --gpu
```
and avoid device-selection syntax like `--gpu 0`, which tells the first rank to use only the GPU with index 0.
You probably just tried a single slave as part of your trials to find the cause of this unwanted behavior, but I still want to emphasize that running `mpirun -n 2` is effectively the same as not using MPI at all, and should be avoided: while you may hide some latency, you incur communication and memory penalties.
This is not related to the issue you are seeing, but is worth noting.
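The rank arithmetic above can be sketched in a few lines of shell. This assumes a simple round-robin mapping of slave ranks onto devices, purely for illustration; the actual assignment RELION makes may differ:

```
#!bash
# Illustrative sketch only: with `mpirun -n 3`, rank 0 acts as the master
# and ranks 1..2 are working slaves. Assuming (hypothetically) that slaves
# are spread round-robin over the 2 GPUs:
NRANKS=3
NGPU=2
NSLAVES=$(( NRANKS - 1 ))
echo "master: rank 0, working slaves: $NSLAVES"
for RANK in $(seq 1 "$NSLAVES"); do
  GPU=$(( (RANK - 1) % NGPU ))
  echo "slave rank $RANK -> GPU $GPU"
done
```

With `-n 2` the same arithmetic leaves a single slave, which is why that case gains nothing over a non-MPI run.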
Originally reported by: AndreHeuer (Bitbucket: Xenoprime, GitHub: Xenoprime)
We have an older graphics card in a workstation which should still be able to do a good job for non-Titan datasets.
Card:
Despite no problems during the build, we were unable to run any GPU jobs from RELION.
Problem:
Note:
We tried to corner the problem, with no positive result:
Question:
Suggestion:
More detailed description of where the GPU job hangs / system:
Top:
```
PID   USER  PR NI VIRT  RES SHR  S %CPU %MEM TIME+   COMMAND
13901 heuer 20 0  327m  22m 7856 R 99.9 0.0  0:41.34 relion_refine_mpi --o Class2D/j..
13903 heuer 20 0  72.3g 18m 9224 R 99.9 0.0  0:41.34 relion_refine_mpi --o Class2D/j..
13902 heuer 20 0  72.3g 20m 9312 R 99.5 0.0  0:41.31 relion_refine_mpi --o Class2D/j..
```
Simple GPU info query:
```
You have 2 nVidia GPGPU.
```
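A device count like the one reported above can be derived by parsing `nvidia-smi -L`-style output. The sketch below runs against a hypothetical two-device listing; the device names are placeholders, not the actual card from this report:

```
#!bash
# Sketch only: count GPUs from `nvidia-smi -L`-style output.
# SAMPLE stands in for the real output of `nvidia-smi -L`;
# the device names below are placeholders.
SAMPLE='GPU 0: (placeholder device name)
GPU 1: (placeholder device name)'
NGPU=$(printf '%s\n' "$SAMPLE" | grep -c '^GPU')
echo "You have $NGPU nVidia GPGPUs."
```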