snakemake / snakemake-executor-plugin-slurm

A Snakemake executor plugin for submitting jobs to a SLURM cluster
MIT License

Missing support for --clusters option #53

Open meliamne opened 3 months ago

meliamne commented 3 months ago

Hi,

Our HPC runs two SLURM clusters, which requires us to pass the -M or --clusters option to most SLURM commands, such as sbatch, squeue and scancel. As far as I can tell, this option is unsupported in Snakemake 8+. In older Snakemake versions, support existed through profiles.

I can get submission to work by specifying --cluster in slurm_extra. However, Snakemake is then unable to detect when a job finishes, so it never starts the next set of rules.

slurm: 23.02.7
snakemake: 8.9.0
snakemake-executor-plugin-slurm: 0.4.2

cmeesters commented 3 months ago

Actually, the idea is: provide the most-common flags, wrap everything else in slurm_extra (documentation is forthcoming).

However, I will discuss adding cluster and gres now.

meliamne commented 3 months ago

Thank you very much for looking into this. I think the current approach generally makes a lot of sense. The main issue in our HPC's case is that, as far as I can tell, the --cluster option from slurm_extra is only applied when submitting jobs. It is not applied when checking the status of a job with sacct or cancelling a job with scancel. In our environment, running these commands without the --cluster option returns nothing. Thus, after starting the first round of jobs, Snakemake gets stuck trying to check their status.
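To illustrate the point, here is a minimal sketch of what applying the cluster option consistently could look like. It is not the plugin's code: `with_clusters` and `build_commands` are hypothetical helpers, and the only assumption about SLURM itself is that sbatch, sacct and scancel all accept `-M`/`--clusters`.

```python
from typing import Dict, List, Optional


def with_clusters(base_cmd: List[str], clusters: Optional[str]) -> List[str]:
    """Insert --clusters=<name> right after the program name, if a name is given.

    Hypothetical helper for illustration only.
    """
    if not clusters:
        # No cluster configured: return an unchanged copy of the command.
        return list(base_cmd)
    return [base_cmd[0], f"--clusters={clusters}", *base_cmd[1:]]


def build_commands(
    job_id: str, script: str, clusters: Optional[str]
) -> Dict[str, List[str]]:
    """Build submit, status and cancel commands that all target the same cluster."""
    return {
        "submit": with_clusters(["sbatch", "--parsable", script], clusters),
        "status": with_clusters(
            [
                "sacct",
                "-X",
                "--parsable2",
                "--noheader",
                "--format=JobIDRaw,State",
                "-j",
                job_id,
            ],
            clusters,
        ),
        "cancel": with_clusters(["scancel", job_id], clusters),
    }


if __name__ == "__main__":
    # "alpha" and "job.sh" are placeholder names, not real resources.
    for name, cmd in build_commands("12345", "job.sh", "alpha").items():
        print(name, " ".join(cmd))
```

The commands are only built here, not executed, so the sketch runs without a SLURM installation. The design point is that the cluster name is passed to all three commands instead of living only in the submit-time extras.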

cmeesters commented 3 months ago

Please check whether PR #56 fixes your issue. I do not have access to a multicluster setup; that might change in the future, but for now I cannot test this feature.

And you are right: checking the job status and cancelling jobs have to be done by the executor plugin. If the cluster information is only available at submit time, this is not sufficient.
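One way to carry the cluster name past submit time, sketched under assumptions: when a cluster is specified, `sbatch --parsable` prints the job ID and the cluster name separated by a semicolon, so both can be recorded together and reused for later status and cancel calls. `parse_parsable` is a hypothetical helper and is not taken from the plugin or from PR #56.

```python
from typing import Optional, Tuple


def parse_parsable(output: str) -> Tuple[str, Optional[str]]:
    """Split `sbatch --parsable` output into (job_id, cluster).

    With a cluster specified, the output has the form "<jobid>;<cluster>";
    otherwise it is just "<jobid>". Hypothetical helper for illustration.
    """
    job_id, _, cluster = output.strip().partition(";")
    # An empty cluster part means no cluster name was reported.
    return job_id, (cluster or None)


if __name__ == "__main__":
    # Example strings only; no sbatch call is made here.
    print(parse_parsable("12345;alpha\n"))
    print(parse_parsable("12345\n"))
```

Storing the returned pair alongside each submitted job would let later sacct and scancel calls target the right cluster without re-reading the submit-time options.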