snakemake / snakemake-executor-plugin-slurm

A Snakemake executor plugin for submitting jobs to a SLURM cluster
MIT License

Best practice: a short-running rule called many times #115

Open bdelepine opened 4 months ago

bdelepine commented 4 months ago

Hi all,

This is a basic question, but I would be glad to hear your thoughts on it: what is the best practice for designing a short-running rule that will spawn many jobs (using Snakemake in a SLURM context, of course)? I would define "short-running" as taking less than 3 minutes, and "many jobs" as thousands of calls.

Without Snakemake, I would have used SLURM job arrays and a wrapper script to group the calls into batches that each run for about 1 hour. My assumption is that it is best to give SLURM chunks that are big enough that we do not stress the scheduler with too many jobs (and stay below the maximum job count), but small enough that the scheduler is more likely to give us resources (and to allocate them fairly among users).
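To make the batching arithmetic concrete, here is a minimal Python sketch. The function names (`plan_batches`, `batch_ranges`) and the figure of 5000 tasks are hypothetical, chosen only for illustration. The 180 s per task is the upper bound of "short-running" as defined above, and 3600 s is the ~1 hour batch target.

```python
import math


def plan_batches(n_tasks, task_seconds, target_batch_seconds=3600):
    """Return (tasks_per_batch, n_batches) for batches near the target runtime.

    Assumes every task takes about `task_seconds` and that the tasks in a
    batch run one after another inside a single SLURM job.
    """
    tasks_per_batch = max(1, target_batch_seconds // task_seconds)
    n_batches = math.ceil(n_tasks / tasks_per_batch)
    return tasks_per_batch, n_batches


def batch_ranges(n_tasks, tasks_per_batch):
    """Return half-open (start, stop) task index ranges, one per batch."""
    return [
        (start, min(start + tasks_per_batch, n_tasks))
        for start in range(0, n_tasks, tasks_per_batch)
    ]


if __name__ == "__main__":
    # Hypothetical workload: 5000 calls of a rule that takes ~3 minutes each.
    per_batch, n_batches = plan_batches(n_tasks=5000, task_seconds=180)
    print(per_batch, n_batches)  # 20 250
```

With these assumed numbers, 5000 individual submissions shrink to 250 jobs of about one hour each, which is the trade-off described above: fewer jobs for the scheduler to track, while each job stays short enough to be scheduled easily.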

With Snakemake and the slurm plugin, I would like to avoid writing a wrapper script, so:

Are my assumptions correct? What do you usually do to deal with short-running rules that are called many times?

Thanks!