Snakemake-Profiles / slurm

Cookiecutter for snakemake slurm profile
MIT License

How to specify slurm error/output paths via profile #29

Closed johnstonmj closed 4 years ago

johnstonmj commented 4 years ago

Hi,

With cluster configuration being deprecated, I am trying to switch over to profiles.

Previously my snakemake command was lengthy:

snakemake -nq --jobs 100 --keep-going --restart-times 3 --use-conda --cluster-config cluster.yaml --cluster "bsub -J {cluster.jobname} -n {cluster.numcpu} -R {cluster.span} -R {cluster.memory} -M {cluster.maxmem} -We {cluster.wall_est} -W {cluster.wall_max} -o {cluster.output} -e {cluster.error} -m {cluster.host} < " all

Referring to a cluster.yaml:

__default__ :
    queue     : normal
    numcpu    : "{threads}"
    memory    : "\"rusage[mem=4000]\""
    span      : "\"span[hosts=1]\""
    maxmem    : 250000
    jobname   : "{rule}.{wildcards}"
    wall_est  : 2:00
    wall_max  : 24:00
    output    : "logs/{rule}.{wildcards}.out"
    error     : "logs/{rule}.{wildcards}.err"
some_rule :
    memory    : "\"rusage[mem=64000]\""
    wall_est  : 48:00
    wall_max  : 168:00
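
For readers less familiar with cluster configs: entries under a rule name override the matching keys in `__default__`, and everything else falls through. A minimal sketch of that lookup, assuming the YAML has already been loaded into a dict (the helper name `resolve_cluster_settings` is hypothetical, and the values are written as strings purely for illustration):

```python
def resolve_cluster_settings(cluster_config, rule_name):
    """Return the __default__ settings overlaid with any rule-specific entries."""
    settings = dict(cluster_config.get("__default__", {}))
    settings.update(cluster_config.get(rule_name, {}))
    return settings


# A subset of the cluster.yaml above, written as a dict to stay self-contained.
cluster_config = {
    "__default__": {
        "queue": "normal",
        "memory": '"rusage[mem=4000]"',
        "wall_est": "2:00",
        "wall_max": "24:00",
    },
    "some_rule": {
        "memory": '"rusage[mem=64000]"',
        "wall_est": "48:00",
        "wall_max": "168:00",
    },
}

some_rule = resolve_cluster_settings(cluster_config, "some_rule")
other_rule = resolve_cluster_settings(cluster_config, "other_rule")
print(some_rule["wall_max"], some_rule["queue"])
print(other_rule["wall_max"])
```

Here `some_rule` picks up its own wall times but still inherits `queue` from the defaults, while a rule with no entry gets the defaults unchanged.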

I like that the profile at ~/.config/snakemake/slurm/config.yaml lets me collapse --jobs 100 --keep-going --restart-times 3 into --profile slurm.

But how do I address the per-rule resource requirements as --cluster-config did?

When I attempt to specify these per-rule requirements as 'resources':

rule:
    input:     ...
    output:    ...
    resources:
        mem_mb=250000,
        output="logs/{rule}.{wildcards}.out"
    shell:
        "..."

This fails because resources are only allowed to be integers.

I tried modifying slurm-jobscript.sh to be:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --output="logs/{rule}.{wildcards}.out"
#SBATCH --error="logs/{rule}.{wildcards}.err"
# properties = {properties}
{exec_job}

But this failed because 'rule' was not defined here.
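
One possible workaround, sketched under assumptions: the template above only shows `{properties}` and `{exec_job}` as placeholders, but once rendered, the `# properties = ...` line should carry the job's metadata as JSON (to my understanding this includes the rule name and wildcards; Snakemake's `snakemake.utils.read_job_properties` reads the same line). A submit script could parse that line and build the log paths itself. The helper names `read_properties` and `log_options` below are hypothetical, and the example jobscript is made up:

```python
import json
import re


def read_properties(jobscript_text):
    """Extract the JSON object from the '# properties = ...' line of a jobscript."""
    for line in jobscript_text.splitlines():
        match = re.match(r"#\s*properties\s*=\s*(.*)", line)
        if match:
            return json.loads(match.group(1))
    raise ValueError("no '# properties = ...' line found")


def log_options(props, log_dir="logs"):
    """Build sbatch --output/--error options from the rule name and wildcards."""
    wildcards = props.get("wildcards", {})
    suffix = ".".join(f"{k}={v}" for k, v in sorted(wildcards.items()))
    stem = f"{log_dir}/{props['rule']}" + (f".{suffix}" if suffix else "")
    return [f"--output={stem}.out", f"--error={stem}.err"]


# Hypothetical rendered jobscript; real ones carry more keys in the JSON.
example = """#!/bin/bash
# properties = {"rule": "align", "wildcards": {"sample": "A"}, "threads": 4}
echo "job body"
"""

props = read_properties(example)
options = log_options(props)
print(options)
```

The resulting options would be appended to the `sbatch` call in the submit script rather than written as `#SBATCH` lines, since the jobscript template itself cannot see `rule`.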

How can I specify strings to be passed to the downstream sbatch command on a per-rule basis (for error logs, partition names, etc)?

Specifying these cluster requirements as 'resources' seems most elegant to me if it would work, but so far I have only had success with a cluster-config file and a lengthy cluster statement at the command line.

Any help would be greatly appreciated! Thanks

percyfal commented 4 years ago

Hi,

I agree there may be situations where one would want to set non-integer values in resources, e.g. for specifying partitions. I'm not too familiar with the implementation though or whether it would be easy to add.

Regarding your output file names, couldn't you use Slurm's filename patterns, e.g. logs/slurm-%x-%j.out, where %x expands to the job name and %j to the job ID?
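
A rough sketch of what that could look like when building the submit command, assuming the job name is set to something rule-specific so that `%x` becomes informative (the function name `sbatch_command` and the example job name are hypothetical):

```python
def sbatch_command(jobscript, job_name, log_dir="logs"):
    """Build an sbatch argument list that names log files after the job."""
    return [
        "sbatch",
        f"--job-name={job_name}",
        # %x (job name) and %j (job ID) are expanded by Slurm itself,
        # not by Python or Snakemake, so they are passed through verbatim.
        f"--output={log_dir}/slurm-%x-%j.out",
        f"--error={log_dir}/slurm-%x-%j.err",
        jobscript,
    ]


cmd = sbatch_command("jobscript.sh", "align.sample=A")
print(" ".join(cmd))
```

As far as I know, sbatch does not create missing directories for these paths, so the logs directory would need to exist before jobs are submitted.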

percyfal commented 4 years ago

Closing this issue, in part due to PR #33. Feel free to reopen.