jdblischak / smk-simple-slurm

A simple Snakemake profile for Slurm without --cluster-config
Creative Commons Zero v1.0 Universal

Simple Slurm

A simple Snakemake profile for Slurm without --cluster-generic-*-cmd

The option --cluster-config was removed in Snakemake 8, but it's still possible to set default and rule-specific resources for submitting jobs to a remote scheduler by combining --default-resources with the resources field of individual rules. This profile is a simplified alternative to the more comprehensive official Slurm profile for Snakemake. For more background, this blog post by Sichong Peng nicely explains this strategy for replacing --cluster-config.

[!WARNING] The Slurm profile and documentation in this repository have been updated to only support Snakemake versions >= 8.0.0. This is because Snakemake 8 completely overhauled how it submits jobs to external clusters, which broke this and all the other existing profiles. If you plan to continue to use Snakemake 7, you can find the Snakemake 7 version of the docs in the v7 branch of this repository.

Features

Limitations

Quick start

  1. Download the configuration file config.v8+.yaml to your Snakemake project. It has to be saved in its own subdirectory, e.g. simple/

  2. Open it in your favorite text editor and replace all the placeholders surrounded by angle brackets (<>) with the options you use to submit jobs on your cluster

  3. You can override any of the defaults by adding a resources field to a rule, e.g.

    rule much_memory:
        resources:
            mem_mb=64000
  4. Invoke snakemake with the profile:

    snakemake --profile simple/
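
Once the placeholders are filled in, the profile might look like the sketch below. This mirrors the snippets used throughout this README; the jobs limit of 500 is an illustrative value, not a requirement:

```yaml
# simple/config.v8+.yaml (sketch; replace values with your cluster's settings)
executor: cluster-generic
cluster-generic-submit-cmd:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
default-resources:
  - partition=<name-of-default-partition>
  - qos=<name-of-quality-of-service>
  - mem_mb=1000
# Illustrative cap on the number of jobs queued at once
jobs: 500
```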

Customizations

See the directory examples/ for examples you can experiment with on your cluster.

A fixed argument to sbatch, e.g. --account

To pass an additional argument to sbatch that will be fixed across all job submissions, add it directly to the arguments passed to sbatch in the field cluster-generic-submit-cmd. For example, to specify an account to use for all job submissions, you can add the --account argument as shown below:

executor: cluster-generic
cluster-generic-submit-cmd:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
    --account=myaccount

A variable argument to sbatch, e.g. --time

To pass an additional argument to sbatch that can vary across job submissions, add it to the arguments passed to sbatch in the field cluster-generic-submit-cmd, list a default value in the field default-resources, and update any rules that require a value different from the default.

For example, the config.v8+.yaml below sets a default time of 1 hour, and the example rule overrides this default to request 3 hours. Note that the quotes around the default time specification are required, even though you don't need quotes when specifying the default for either partition or qos.

executor: cluster-generic
cluster-generic-submit-cmd:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
    --time={resources.time}
default-resources:
  - partition=<name-of-default-partition>
  - qos=<name-of-quality-of-service>
  - mem_mb=1000
  - time="01:00:00"
# A rule in Snakefile
rule more_time:
    resources:
        time = "03:00:00"

Note that sbatch accepts time limits in various formats. Above I used hours:minutes:seconds, but this profile is agnostic to the format you choose. It's a good idea to be consistent across rules, but it's not required. From the Slurm 19.05.7 documentation:

A time limit of zero requests that no time limit be imposed. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".

Thus to instead use minutes, you could achieve the same effect as above with:

executor: cluster-generic
cluster-generic-submit-cmd:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
    --time={resources.time}
default-resources:
  - partition=<name-of-default-partition>
  - qos=<name-of-quality-of-service>
  - mem_mb=1000
  - time=60
# A rule in Snakefile
rule more_time:
    resources:
        time = 180

See examples/time-integer/ and examples/time-string/ for examples you can play with. Note that specifying the time as a string requires a minimum Snakemake version of 5.15.0.

Using a cluster status script

By default, Snakemake can monitor jobs submitted to Slurm. I realized this when reading this detailed blog post, in which the author decided not to use the cluster-status.py script provided by the official Slurm profile. Thus if your jobs aren't often failing silently, there's no need to worry about this extra configuration step.

However, if you start to have jobs silently fail often, e.g. with status TIMEOUT for exceeding their time limit, then you can add a custom script to monitor the job status with the option --cluster-generic-status-cmd.

The directory extras/ contains multiple options for checking the status of the jobs. You can choose which one you'd like to use:

To use one of these status scripts:

  1. Download the script to your profile directory where config.yaml is located

  2. Make the script executable, e.g. chmod +x status-sacct.sh

  3. Add the field cluster-generic-status-cmd to your config.yaml, e.g. cluster-generic-status-cmd: status-sacct.sh

  4. Add the flag --parsable to your sbatch command (requires Slurm version 14.03.0rc1 or later)
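
Putting the steps above together, the relevant parts of the profile might look like this sketch (status-sacct.sh is one of the scripts from extras/):

```yaml
# config.yaml (sketch; status-sacct.sh must be executable and in this directory)
executor: cluster-generic
cluster-generic-status-cmd: status-sacct.sh
cluster-generic-submit-cmd:
  mkdir -p logs/{rule} &&
  sbatch
    --parsable
    --partition={resources.partition}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
```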

Multiple clusters

Slurm can be configured to submit jobs to multiple clusters. Below is my advice on how to set this up. However, although I've worked with multiple HPC clusters running Slurm, I've never encountered this situation myself, so I'd appreciate any contributions to improve the documentation below.

  1. If you have access to multiple clusters, but only need to submit jobs to the default cluster, then you shouldn't have to modify anything in this profile

  2. If you want to always submit your jobs to a cluster other than the default, or use multiple clusters, then pass the option --clusters to sbatch, e.g. to submit your jobs to either cluster "c1" or "c2"

    # config.v8+.yaml
    executor: cluster-generic
    cluster-generic-submit-cmd:
      mkdir -p logs/{rule} &&
      sbatch
        --clusters=c1,c2
  3. To set a default cluster and override it for specific rules, use --default-resources. For example, to run on "c1" by default but "c2" for a specific rule:

    # config.v8+.yaml
    executor: cluster-generic
    cluster-generic-submit-cmd:
      mkdir -p logs/{rule} &&
      sbatch
        --clusters={resources.clusters}
    default-resources:
      - clusters=c1
    # Snakefile
    rule different_cluster:
        resources:
            clusters="c2"
  4. Using a custom cluster status script in a multi-cluster setup requires Snakemake 7.1.1+ (or Snakemake 8.0.0+ if you are using the new --cluster-generic-*-cmd flags). After you add the flag --parsable to sbatch, it will return jobid;cluster_name. I adapted status-sacct.sh to handle this situation. Please see examples/multi-cluster/ to try out status-sacct-multi.sh

Use speed with caution

A big benefit of the simplicity of this profile is the speed with which jobs can be submitted and their statuses checked. The official Slurm profile for Snakemake provides a lot of extra fine-grained control, but this is all defined in Python scripts, which then have to be invoked for each job submission and status check. I needed this speed for a pipeline with an aggregation rule that had to run tens of thousands of times, where each job finished in under 10 seconds. In that situation, the job submission and status check rates were huge bottlenecks.

However, you should use this speed with caution! On a shared HPC cluster, many users are making requests to the Slurm scheduler, and if too many requests are made at once, performance suffers for all users. If the rules in your Snakemake pipeline take more than a few minutes to complete, then it's overkill to check the status of multiple jobs every second. In other words, only increase max-jobs-per-second and/or max-status-checks-per-second if the submission rate or the status checks that confirm job completion are clear bottlenecks.
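
If you have confirmed such a bottleneck, these two throttles can be raised in the profile's config file; the values below are purely illustrative, not recommendations:

```yaml
# Raise only after confirming submission or status checks are the bottleneck
max-jobs-per-second: 20
max-status-checks-per-second: 10
```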

License

This is all boilerplate code. Please feel free to use it for whatever purpose you like. There's no need to attribute or cite this repo, but of course it comes with no warranties. To make it official, it's released under the CC0 license. See LICENSE for details.