Thanks @DStrelak, I will check it out ASAP.
Can you try if this works for you?
Hi @DStrelak , I checked it and it worked. Here is the corresponding output of xmipp compile:
mpirun -np 4 --oversubscribe echo ' > This sentence should be printed 4 times if mpi runs fine (by mpirun).'
> This sentence should be printed 4 times if mpi runs fine (by mpirun).
> This sentence should be printed 4 times if mpi runs fine (by mpirun).
> This sentence should be printed 4 times if mpi runs fine (by mpirun).
> This sentence should be printed 4 times if mpi runs fine (by mpirun).
These were the lines of code that I edited (starting from this line):
if checkProgram("mpirun",False):
    ok=(runJob("mpirun -np 4 --oversubscribe echo '%s (by mpirun).'" % echoString) or
        runJob("mpirun -np 4 --allow-run-as-root --oversubscribe echo '%s (by mpirun).'" % echoString))
elif checkProgram("mpiexec",False):
    ok=(runJob("mpiexec -np 4 --oversubscribe echo '%s (by mpiexec).'" % echoString) or
        runJob("mpiexec -np 4 --oversubscribe --allow-run-as-root echo '%s (by mpiexec).'" % echoString))
I can't verify whether the --oversubscribe flag works with mpiexec. To make sure it is the --oversubscribe flag that solves the issue (since I have 4 MPI slots that could be free by chance), I also increased 4 to 10, and it still worked.
This is what I used to test:
[mohamad@localhost ~]$ which mpirun
/usr/lib64/openmpi3/bin/mpirun
[mohamad@localhost ~]$ mpirun --version
mpirun (Open MPI) 3.1.3
Regards
'--oversubscribe' was added in MPI 2.1, while e.g. Travis uses version 1.6. @dmaluenda, do we / can we detect the MPI version?
We can check the MPI version by parsing 'mpirun --version'. However, we can add the '--oversubscribe' flag to the 'or' string in order to avoid a failure caused just by the lack of that flag.
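For illustration, a rough sketch of that version parsing (the helper name, the regex and the 2.1 threshold are assumptions based on this thread, not code taken from the xmipp script); the check could then append '--oversubscribe' only when this returns True:
import re
import subprocess

def supports_oversubscribe():
    """Return True if 'mpirun --version' reports version >= 2.1, the first
    Open MPI release that understands '--oversubscribe'."""
    try:
        out = subprocess.run(["mpirun", "--version"],
                             capture_output=True, text=True).stdout
    except OSError:
        return False
    match = re.search(r"(\d+)\.(\d+)", out)  # e.g. "mpirun (Open MPI) 3.1.3"
    if not match:
        return False
    return (int(match.group(1)), int(match.group(2))) >= (2, 1)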
Alternatively, we can use 'mpirun -np 2 ...'. 2 should always be fine, shouldn't it?
In theory, yes. I doubt that anybody would be brave enough to use xmipp with fewer than two cores. How about we link it to the number of jobs used for the build?
I thought the same: "Who wants to run Xmipp with less than 4 cores?" But the answer is: "What about the login nodes of clusters?" Damn!
I agree on linking the number of MPI jobs to the number of cores used for the compilation. Indeed, in the hypothetical case that N=1, if
mpirun -np 1 echo whatever
works, it's fine. We are checking that mpirun works, and this proves it.
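As a minimal sketch of that linking (the helper name and the cap of 4 are my assumptions; the real script would reuse its own build-job count):
import os

def sanity_check_np(build_jobs=None, cap=4):
    """Derive the -np value for the MPI sanity check from the build jobs or
    the detected cores, never exceeding the cap and never going below 1."""
    cores = os.cpu_count() or 1
    wanted = build_jobs if build_jobs else cores
    return max(1, min(wanted, cores, cap))

# e.g.: runJob("mpirun -np %d echo '%s (by mpirun).'" % (sanity_check_np(), echoString))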
Can we close this issue?
I think yes. I forced it to use only 2 cores, which is the minimum that makes sense...
If the problem persists, please don't hesitate to reopen this issue so that we can take a more accurate approach.
No objection. Thank you both, @dmaluenda @DStrelak
Hello again, I have just hit another similar issue. I am trying to compile xmipp on a supercomputer, where we usually run MPI jobs through a script submitted with sbatch. Here is what I get:
[uhj53dz@jean-zay4: xmipp]$ ./xmipp
'xmipp.conf' detected.
Checking configuration ------------------------------
Checking compiler configuration ...
g++ 8 detected
g++ -c -w -mtune=native -march=native -std=c++11 -O3 xmipp_test_main.cpp -o xmipp_test_main.o -I../ -I/gpfswork/rech/nvo/uhj53dz/miniconda3/include -I/gpfswork/rech/nvo/uhj53dz/miniconda3/include/python3.8 -I/gpfswork/rech/nvo/uhj53dz/miniconda3/lib/python3.8/site-packages/numpy/core/include
g++ -L/gpfswork/rech/nvo/uhj53dz/miniconda3/lib xmipp_test_main.o -o xmipp_test_main -lfftw3 -lfftw3_threads -lhdf5 -lhdf5_cpp -ltiff -ljpeg -lsqlite3 -lpthread
rm xmipp_test_main*
Checking MPI configuration ...
mpicxx -c -w -I../ -I/gpfswork/rech/nvo/uhj53dz/miniconda3/include -mtune=native -march=native -std=c++11 -O3 xmipp_mpi_test_main.cpp -o xmipp_mpi_test_main.o
mpicxx -L/gpfswork/rech/nvo/uhj53dz/miniconda3/lib xmipp_mpi_test_main.o -o xmipp_mpi_test_main -lfftw3 -lfftw3_threads -lhdf5 -lhdf5_cpp -ltiff -ljpeg -lsqlite3 -lpthread
rm xmipp_mpi_test_main*
mpirun -np 1 echo ' > This sentence should be printed 2 times if mpi runs fine.'
This version of Spack (openmpi ~legacylaunchers schedulers=slurm)
is installed without the mpiexec/mpirun commands to prevent
unintended performance issues. See https://github.com/spack/spack/pull/10340
for more details.
If you understand the potential consequences of a misconfigured mpirun, you can
use spack to install 'openmpi+legacylaunchers' to restore the executables.
Otherwise, use srun to launch your MPI executables.
mpirun -np 1 --allow-run-as-root echo ' > This sentence should be printed 2 times if mpi runs fine.'
This version of Spack (openmpi ~legacylaunchers schedulers=slurm)
is installed without the mpiexec/mpirun commands to prevent
unintended performance issues. See https://github.com/spack/spack/pull/10340
for more details.
If you understand the potential consequences of a misconfigured mpirun, you can
use spack to install 'openmpi+legacylaunchers' to restore the executables.
Otherwise, use srun to launch your MPI executables.
mpirun or mpiexec have failed.
Cannot compile with MPI or use it
rm xmipp_mpi_test_main*
rm: cannot remove 'xmipp_mpi_test_main*': No such file or directory
I will try to work around this by commenting out this test. I will reply here again with my progress.
Regards, Mohamad
Hi @MohamadHarastani , Thanks for reporting this problem. However, I don't think that there's anything we can / should do about this particular case (unless, of course, it turns out to be a widespread issue). As you surely understand, we can't prepare our script for all possible environments; the admin of your machine will be able to resolve this problem. I'm, however, a bit worried about how / whether Scipion will work fine in that environment (we should have support for Slurm, but AFAIK it's not often used [read: well tested]). If in doubt, feel free to contact us, and we'll gladly help! KR, David
Thanks @DStrelak for your reply. I commented out the lines that require mpirun and the installation continued. We have been using the previous xmipp version (compatible with scipion2) on the same supercomputer, and now I am trying to compile the new version. Of course, we don't need support for all environments or for slurm; we prepare our slurm scripts manually, and all we need is a successful xmipp compilation (it is just a linux redhat, compiled with conda). I have limited experience with slurm, but it is used on the two supercomputers that we have access to here. I passed this step now. Just for the record, I commented out these lines, starting from here:
# if not (runJob("%s -np 2 echo '%s.'" % (configDict['MPI_RUN'], echoString)) or
#         runJob("%s -np 2 --allow-run-as-root echo '%s.'" % (configDict['MPI_RUN'], echoString))):
#     print(red("mpirun or mpiexec have failed."))
#     return False
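For reference, a rough, untested sketch of how the check could instead fall back to srun (as the Spack message suggests) rather than being removed; runJob, checkProgram, red, configDict and echoString are the existing names in the xmipp script:
ok = (runJob("%s -np 2 echo '%s.'" % (configDict['MPI_RUN'], echoString)) or
      runJob("%s -np 2 --allow-run-as-root echo '%s.'" % (configDict['MPI_RUN'], echoString)))
if not ok and checkProgram("srun", False):
    # On Slurm clusters whose Open MPI ships without mpirun/mpiexec,
    # srun can launch the same test; -n is the number of tasks.
    ok = runJob("srun -n 2 echo '%s (by srun).'" % echoString)
if not ok:
    print(red("mpirun or mpiexec have failed."))
    return False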
We can close this issue and rediscuss a solution if needed (maybe a flag to skip this mpirun test, together with an error message that points to that flag).
Regards, Mohamad
Hi @MohamadHarastani , I'm glad that it was the only hurdle you've met. The flag to skip (a specific) config test sounds good to me. What do you think, @dmaluenda ?
I agree on a bypass flag. I vote for an environment variable like XMIPP_NOMPICHECK=True, or something like that. That way we can add it to the https://github.com/I2PC/xmipp/wiki/Xmipp-configuration-(version-20.07) guide.
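A minimal sketch of that bypass (XMIPP_NOMPICHECK is only the suggested name; runJob, red, configDict and echoString come from the existing xmipp script):
import os

def check_mpi_runs(configDict, echoString):
    """Run the mpirun sanity check unless XMIPP_NOMPICHECK asks to skip it."""
    if os.environ.get('XMIPP_NOMPICHECK', '').lower() in ('true', '1', 'yes'):
        print("XMIPP_NOMPICHECK is set: skipping the mpirun check.")
        return True
    if not (runJob("%s -np 2 echo '%s.'" % (configDict['MPI_RUN'], echoString)) or
            runJob("%s -np 2 --allow-run-as-root echo '%s.'" % (configDict['MPI_RUN'], echoString))):
        print(red("mpirun or mpiexec have failed. "
                  "Set XMIPP_NOMPICHECK=True to skip this check."))
        return False
    return True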
By the way, note that the whole config-checking can be skipped just by manually stepping the build:
./xmipp config
./xmipp compileAndInstall
(note the missing ./xmipp checkConfig in between)
Thanks a lot for this hint. I don't think a flag is necessary in this case. I will try this option soon and comment on the result.
Should be resolved.
Hi, while compiling xmipp on a personal laptop, I faced an error as follows: "There are not enough slots available in the system to satisfy the 4 slots". I have exactly 4 MPI slots on this processor (Intel® Core™ i7-4500U CPU @ 1.80GHz × 4). The test only prints a sentence 4 times, but it ended up breaking the compilation. I fixed the issue by replacing '4' with '2' in lines 692 to 698 here: https://github.com/I2PC/xmipp/blob/devel/xmipp#L692 Couldn't we test whether MPI runs during the installation in another way, or turn the error into a warning?
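For illustration, a rough sketch of the 'warning instead of error' idea (this is not what the script currently does; runJob, red and echoString are the script's own names):
if not runJob("mpirun -np 4 echo '%s (by mpirun).'" % echoString):
    # Warn but keep going, so a machine with few MPI slots can still compile.
    print(red("Warning: the mpirun test failed; "
              "MPI programs may not run correctly on this machine."))
# ...configuration continues instead of returning False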
Cheers, Mohamad