You are supposed to write a bash script that contains the SLURM directives specifying the requested resources (time, memory, number of cores, parallelization, etc.) followed by the commands you want to run. You can then submit that script with `sbatch FILENAME.sh` to execute the pipeline on SLURM.
An example of a working SLURM batch script:

```bash
#!/bin/bash -l
#SBATCH --partition=panda   # cluster-specific
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --job-name=slurm_qc
#SBATCH --time=00:08:00     # HH:MM:SS
#SBATCH --mem=50G           # memory requested, units available: K,M,G,T
#SBATCH --output slurm_qc-%j.out
#SBATCH --error slurm_qc-%j.err

source ~/.bashrc
mamba activate short-read-quality-control

echo "This is job #:" $SLURM_JOB_ID >> slurm_qc_output.txt
echo "conda activated?"
python /home/chf4012/camp_short-read-quality-control/workflow/short-read-quality-control.py -c 5 -d /home/chf4012/camp_short-read-quality-control/test_data_tadpole_3 -s /home/chf4012/camp_short-read-quality-control/test_data_tadpole_3/samples.csv
exit
```
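For example, assuming the script above is saved as slurm_qc.sh, it can be submitted and monitored from the login node like this (the job ID shown is illustrative):

```bash
# Submit the batch script to the SLURM scheduler
sbatch slurm_qc.sh
# Submitted batch job 1234567

# Check the job's status while it is queued or running
squeue -j 1234567

# After completion, inspect the log files named by --output/--error
less slurm_qc-1234567.out
less slurm_qc-1234567.err
```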
Hopefully this addresses your question. Let me know if you have more questions.
Yes, I know I'm able to run it this way. My point was that according to the documentation site I can try running the Python script with the '--slurm' flag, which gives the above error. If it shouldn't be run with '--slurm', consider updating the documentation. Thanks anyway!
makes sense, let me check in with the team about it
Your observations are correct, but the `--slurm` flag was described in the README (and by Tom) that way intentionally. Most cluster access nodes have a limited number of threads, and a persistent process like Snakemake running on them for hours or days ties up the already small thread pool. As a preventative measure, the `--slurm` flag only functions properly when the command is submitted as a bash script or within a bash script. If this is not an issue on your cluster, please submit a feature pull request with a new cluster submission flag and we'll review it and get back to you. Thanks!
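For concreteness, here is a minimal sketch of what "submitted within a bash script" can look like, reusing the paths and environment name from the example above; the partition, job name, and resource requests are placeholders to be adapted to your cluster:

```bash
#!/bin/bash -l
#SBATCH --partition=panda              # cluster-specific placeholder
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --job-name=camp_qc_slurm       # placeholder job name
#SBATCH --time=04:00:00                # HH:MM:SS, placeholder
#SBATCH --mem=8G                       # placeholder
#SBATCH --output camp_qc_slurm-%j.out
#SBATCH --error camp_qc_slurm-%j.err

source ~/.bashrc
mamba activate short-read-quality-control

# With --slurm, the pipeline submits its Snakemake rules via sbatch (see the
# discussion above), so this only works on clusters that allow job submission
# from compute nodes.
python /home/chf4012/camp_short-read-quality-control/workflow/short-read-quality-control.py \
    --slurm \
    -c 5 \
    -d /home/chf4012/camp_short-read-quality-control/test_data_tadpole_3 \
    -s /home/chf4012/camp_short-read-quality-control/test_data_tadpole_3/samples.csv
```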
Hi, I'm testing CAMP on a job submission cluster that uses SLURM as its queueing system (https://www.top500.org/system/179958/). I followed the CAMP Short-Read Quality Control README.md instructions for cluster submission and executed short-read-quality-control.py with the '--slurm' flag. Snakemake won't complete any job because of:
sbatch: error: Batch job submission failed: Access/permission denied
This is not surprising, as the cluster won't allow job submission from a compute host. The compute host doesn't have access to any workload manager, only to the utilities specified in the initial sbatch command from the README:

```bash
sbatch -J jobname -o jobname.log << "EOF"
#!/bin/bash
python /path/to/camp_short-read-quality-control/workflow/short-read-quality-control.py --slurm \
    (-c max_number_of_parallel_jobs_submitted) \
    -d /path/to/work/dir \
    -s /path/to/samples.csv
EOF
```

As I understand from the code, you try to run a SLURM job inside a SLURM job (correct me if I'm wrong).
I guess I won't be the only user who has this problem, so could you guys address it somehow?
P.S. I'm able to execute the script when I omit the '--slurm' flag.