Hi all,

This is a basic question, but I would be glad to hear your thoughts on it: what is the best practice for designing a short-running rule that will be used to spawn many jobs (using Snakemake in a SLURM context, of course)? I would define "short-running" as under 3 minutes, and "many jobs" as thousands of calls.
Without Snakemake, I would have used SLURM job arrays and a wrapper script to build batches of roughly 1 h of work per job. My assumption is that it is best to give SLURM chunks that are big enough not to stress it with too many jobs (and to stay below the maximum number of jobs allowed), but small enough that the scheduler is more likely to grant us resources (and to allocate them fairly among users).
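For context, here is roughly what I mean (a sketch only; samples.txt, process_one.sh and the batch size are placeholders):

```
#!/bin/bash
# Sketch of the job-array approach (placeholders: samples.txt, process_one.sh).
# One array task processes a 30-sample chunk, i.e. ~30 x 2 min ≈ 1 h.
#SBATCH --array=0-99
#SBATCH --time=01:10:00
#SBATCH --cpus-per-task=1

BATCH_SIZE=30
START=$(( SLURM_ARRAY_TASK_ID * BATCH_SIZE ))
# samples.txt lists one sample per line; each task takes its own slice
for sample in $(sed -n "$(( START + 1 )),$(( START + BATCH_SIZE ))p" samples.txt); do
    ./process_one.sh "$sample"
done
```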
With Snakemake and the slurm plugin, I would like to avoid writing a wrapper script, so:
I may write rules to split/gather the batches, much like the built-in scatter-gather feature. This works, but it is essentially the same as writing a wrapper script.
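Roughly, something like this (again a sketch only; the sample list, batch size and process_one.sh are placeholders):

```
# Sketch of option 1 (placeholders: SAMPLES, BATCH_SIZE, process_one.sh).
# Each SLURM job processes BATCH_SIZE samples in series, so one submission
# represents roughly 30 x 2 min ≈ 1 h of work.
SAMPLES = [f"sample{i}" for i in range(3000)]
BATCH_SIZE = 30
BATCHES = [SAMPLES[i:i + BATCH_SIZE] for i in range(0, len(SAMPLES), BATCH_SIZE)]

rule all:
    input:
        expand("results/batch_{b}.done", b=range(len(BATCHES)))

rule process_batch:
    output:
        touch("results/batch_{b}.done")
    params:
        samples=lambda wc: BATCHES[int(wc.b)]
    resources:
        runtime=70,          # minutes: the ~1 h batch plus some margin
        cpus_per_task=1
    shell:
        "for s in {params.samples}; do ./process_one.sh $s; done"
```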
I may use group and group-components as in https://github.com/snakemake/snakemake/issues/872. This also works, but I find it somewhat cumbersome to parametrize the resources (for example, to build ~1 h batches out of a rule that typically takes ~2 min, I must first set cores to this rule's cpus_per_task so that the calls run in series, then set group-components to 30 (= 60/2); and since cores applies to all groups, it gets more complicated if I must design "batches" for several rules).
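In that case the rule would look something like this (a sketch with placeholder names; tuning cores/cpus_per_task so the calls really run in series is exactly the part I find awkward):

```
# Sketch of option 2 (placeholders: short_task, process_one.sh).
# All instances of the rule share one group; --group-components then
# merges several connected components into a single SLURM submission.
rule short_task:
    input:
        "data/{sample}.in"
    output:
        "results/{sample}.out"
    group:
        "short_batch"
    resources:
        runtime=2,           # minutes: one call takes ~2 min
        cpus_per_task=1
    shell:
        "./process_one.sh {input} {output}"
```

and I would run it with something like snakemake --executor slurm --jobs 50 --group-components short_batch=30, on top of the cores tuning described above.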
Are my assumptions correct? What do you usually do to deal with short-running rules called many times?
Thanks!