leke-lyu closed this issue 2 years ago
Here I also attach the bash script I use to submit the job on the Slurm system:
cd $SLURM_SUBMIT_DIR
module load ParGenes/20220329-foss-2020b-Python-3.8.6-Java-1.8
python /apps/eb/ParGenes/20220329-foss-2020b-Python-3.8.6-Java-1.8/pargenes/pargenes-hpc.py -a output -o pargenes_output -r raxml_options.txt --seed 3000 -s 0 -p 10 -b 0 -d nt -c 10
Hope you can help!
Dear leke-lyu,
Apparently there is a problem with the number of cores/slots you request with the -c option. See line 28 of the report file; this error message comes from MPI:
There are not enough slots available in the system to satisfy the 10
slots that were requested by the application:
Could it be that your submission script does not allocate enough slots? I am not very experienced with Slurm, so I can't tell what might be wrong in yours. But the submission scripts I use look like this:
#SBATCH -B 2:8:1
#SBATCH -N 32 # because our cluster has 16 cores per node, and 512/16=32
#SBATCH -n 512
#SBATCH --threads-per-core=1
#SBATCH --cpus-per-task=1
#SBATCH --hint=compute_bound
#SBATCH -t 24:00:00
I would not copy-paste it as-is, because every cluster is configured differently, but maybe it helps a bit.
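For comparison, a minimal Slurm submission script that allocates enough MPI slots for the `-c 10` ParGenes call from the original post might look like the sketch below. The `-n` value and time limit are assumptions; adapt them to your cluster's configuration:

```shell
#!/bin/bash
#SBATCH -n 10              # number of MPI tasks (slots); should be >= the ParGenes -c value
#SBATCH --cpus-per-task=1  # one core per MPI rank
#SBATCH -t 24:00:00        # assumed time limit; adjust to your data set

cd "$SLURM_SUBMIT_DIR"
module load ParGenes/20220329-foss-2020b-Python-3.8.6-Java-1.8
python /apps/eb/ParGenes/20220329-foss-2020b-Python-3.8.6-Java-1.8/pargenes/pargenes-hpc.py \
    -a output -o pargenes_output -r raxml_options.txt \
    --seed 3000 -s 0 -p 10 -b 0 -d nt -c 10
```

The key point is that the number of tasks Slurm grants (`-n`) must cover the number of cores ParGenes passes on to `mpiexec`, otherwise MPI reports the "not enough slots" error.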
If you want to make sure that the problem is the script (and not ParGenes), you can replace your ParGenes call with:
mpiexec -np 10 echo "hello"
This should print "hello" 10 times if the script is correct; otherwise I would expect the same error message as the one in your report.
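Along the same lines, a hypothetical pre-flight check in the job script could compare the number of tasks Slurm actually granted (exported as `SLURM_NTASKS` inside a job) against the `-c` value before calling ParGenes:

```shell
#!/bin/bash
# Hypothetical pre-flight check: Slurm exports SLURM_NTASKS inside a job.
# Outside a job it is unset, so default to 0.
requested=10   # must match the -c value passed to ParGenes
granted="${SLURM_NTASKS:-0}"
if [ "$granted" -lt "$requested" ]; then
    echo "warning: only $granted MPI slots granted, $requested requested"
else
    echo "ok: $granted slots granted"
fi
```

If the warning fires, the fix belongs in the `#SBATCH` directives (e.g. `-n`), not in the ParGenes options.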
Let me know if this helps ;-) Benoit
Thank you Benoit, The issue has been solved!
Great, thanks for the feedback ;)
I launched the run on the cluster, and I got this:
ParGenes report file for run pargenes_output
[REPORT] MainLogs
########################
PARGENES v1.2.0
########################
ParGenes was called as follow: /apps/eb/ParGenes/20220329-foss-2020b-Python-3.8.6-Java-1.8/pargenes/pargenes-hpc.py -a output -o pargenes_output -r raxml_options.txt --seed 3000 -s 0 -p 10 -b 0 -d nt -c 10 --scheduler split
[0:00:00] end of MSAs initializations
Calling mpi-scheduler: mpiexec -n 10 /apps/eb/ParGenes/20220329-foss-2020b-Python-3.8.6-Java-1.8/pargenes/pargenes_src/../pargenes_binaries/mpi-scheduler --split-scheduler 10 /apps/eb/ParGenes/20220329-foss-2020b-Python-3.8.6-Java-1.8/pargenes/pargenes_src/../pargenes_binaries/raxml-ng-mpi.so pargenes_output/parse_run/parse_command.txt pargenes_output/parse_run
Logs will be redirected to pargenes_output/parse_run/logs.txt
[Error] [0:00:00] mpi-scheduler execution failed with error code 1
[Error] [0:00:00] Will now exit...
[Error] <class 'RuntimeError'> mpi-scheduler execution failed with error code 1
Writing report file in /home/ll22780/tipTraitAssociation/covid19_cme_analysis-master/myData/pargenes_output/report.txt
When reporting the issue, please always send us this file.
report.txt