Snakemake-Profiles / slurm

Cookiecutter for snakemake slurm profile

Not quite understanding how to use this #46

Closed: proteins247 closed this issue 4 years ago

proteins247 commented 4 years ago

I have read through the README and installed this profile, but I still don't quite understand how it works. I am definitely a novice to snakemake, and I've gone through the snakemake tutorial. A key sticking point right now is running a workflow on my SLURM-managed HPC cluster. In particular, I don't understand how to configure this profile, and the description in the README.md is not clear to me.

I think I understand how to run a workflow on SLURM using cluster-config. The NIH cluster has a webpage that provides examples as well: snakemake. I see that cluster-config is deprecated, however.

When I installed the profile, I did not configure any of the cookiecutter options, not fully understanding them at the time. I can see that in config.yaml, cluster: "slurm-submit.py" is analogous to the --cluster command-line argument for snakemake. I can also see that the purpose of slurm-submit.py is to generate the final sbatch command and call it. I am not sure what slurm-jobscript.sh does, however.

Is the user supposed to edit config.yaml with the particulars of the user's cluster, such as partition and time limit? It seems to me that config.yaml should not be edited (since it serves to connect the various parts of this profile, and inadvertent editing could break it). So, should I create a separate file (either JSON or YAML) that perhaps should be specified as CLUSTER_CONFIG in slurm-submit.py?

I think, for my first production workflow, I'd like fine control over how each step is submitted to the cluster, just so I know what's going on. That basically means I would have a workdir-specific cluster.json file that I specify via --cluster-config?

It's possible I'm overcomplicating things and worrying unnecessarily about cluster-config being deprecated. Am I right in my understanding that I could do snakemake --profile slurm --cluster-config my_job_config.json, and things should work?

percyfal commented 4 years ago

Hi,

upon installation you configured four options (as defined in cookiecutter.json):

1) profile name (i.e. the directory where the profile is installed)
2) sbatch_defaults - typically the account name and log file locations, e.g. "--account your_project_account --output logs/slurm-%j.out --error logs/slurm-%j.err"
3) cluster_config - path to a cluster configuration file; as you noted, this is deprecated in snakemake, but it is retained here because it provides a way to fine-tune configuration on a rule-by-rule basis (e.g. runtime, constraints); see the sketch after this list
4) advanced_argument_conversion - you can safely ignore this for now and set it to no
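
For reference, such a cluster configuration file is plain YAML (or JSON) with a __default__ section plus per-rule overrides, roughly along these lines (rule name and values are placeholders only):

```yaml
# sketch of a cluster configuration file; keys are assumed to follow sbatch long-option names
__default__:
  account: your_project_account
  partition: core
  time: "01:00:00"

bwa_map:                 # overrides applied only to the rule bwa_map
  time: "12:00:00"
  constraint: mem256GB
```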

Without a profile, you would submit cluster jobs with something like

```bash
snakemake --cluster "sbatch --account account --output logs/slurm-%j.out --error logs/slurm-%j.err -t 12:00:00 ..." -j 1 jobname --cluster-config cluster-config
```

With a profile, you save typing:

```bash
snakemake --profile profile_name -j 1 jobname
```

Here the sbatch_defaults option above is passed to the sbatch call, and the cluster configuration file set in the cluster_config option is used. In addition, the profile uses the slurm-status.py script to check job status; the main benefit is that jobs that time out will be caught as failed, something that does not happen when you submit without a profile. Also, if you add resources such as "runtime" to a rule, these will be parsed and added to the sbatch call.
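
For example, a rule along these lines (rule name and shell command are placeholders) gets its runtime passed on as the sbatch time limit:

```python
# sketch of a rule with a runtime resource; rule name and command are placeholders
rule align:
    input:
        "data/{sample}.fq"
    output:
        "results/{sample}.bam"
    resources:
        runtime=720   # assumed to be in minutes; parsed by the profile and forwarded to sbatch
    shell:
        "run_alignment {input} > {output}"
```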

As of snakemake 5.15, resources can now be strings, which means cluster configuration files could be eliminated entirely, although I haven't had time to look into this yet.

I hope this helps. If there is something I can do to improve the README, please let me know.

Cheers,

Per

proteins247 commented 4 years ago

That is useful, thank you. I guess what I needed was exactly what you have just given: an example of what to do and some recommended practice for where to put what.

The fact that cluster-config is deprecated is confusing, since it still appears to be useful. From what I can understand of this issue, https://github.com/snakemake/snakemake/issues/248, which has been referenced here as well, the cluster-config option does not have a clear replacement.

I did find another webpage that describes how to use profiles: https://www.sichong.site/2020/02/25/snakemake-and-slurm-how-to-manage-workflow-with-resource-constraint-on-hpc/. I've linked it in case it's useful to other readers.

proteins247 commented 4 years ago

I just want to report that a hybrid approach currently works best for me:

```bash
snakemake --profile slurm --cluster-config config.yaml
```

Within config.yaml (the cluster configuration file passed via --cluster-config, not the profile's own config.yaml), I can define slurm parameters (e.g. partition, mem-per-cpu) that I prefer for particular steps of my workflow.
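
An entry along these lines (rule name and values are placeholders here) lets me pick the partition and per-CPU memory for a single step:

```yaml
# sketch of such a cluster configuration entry; rule name and values are placeholders
__default__:
  partition: norm
  mem-per-cpu: 2G

assemble:
  partition: largemem
  mem-per-cpu: 16G
```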

Maybe use of resources can eliminate use of --cluster-config, but that'll wait until later for me.

jdblischak commented 3 years ago

> Maybe use of resources can eliminate use of --cluster-config, but that'll wait until later for me.

@proteins247 I was able to replace the features of --cluster-config with a combination of resources and default-resources. Check out https://github.com/jdblischak/smk-simple-slurm
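
The gist, sketched from memory here (so treat it as an illustration and check the repo for the exact, tested config), is to reference resources directly in the cluster command and give every rule fallbacks via default-resources:

```yaml
# illustrative profile config.yaml in the spirit of smk-simple-slurm; not the exact file from that repo
cluster:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --time={resources.time}
    --output=logs/{rule}/%j.out
default-resources:
  - partition="general"   # placeholder partition name
  - mem_mb=1000
  - time="01:00:00"
jobs: 100
```

Per-rule overrides then go straight into resources in the Snakefile, so no separate cluster configuration file is needed.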

jdblischak commented 3 years ago

> I did find another webpage that describes how to use profiles: https://www.sichong.site/2020/02/25/snakemake-and-slurm-how-to-manage-workflow-with-resource-constraint-on-hpc/. I've linked it in case it's useful to other readers.

I agree, this is a super helpful blog post that demonstrates how to use resources and default-resources to replace --cluster-config for numeric resources. It was written in February 2020, when Snakemake resources could only be numeric. A few months later, in April 2020, version 5.15.0 enabled specifying resources as strings (changelog). So you can now specify that particular rules run on a specific partition or with a different quality of service.
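
For example (rule and values below are hypothetical, and the profile or --cluster command has to forward these resources to sbatch):

```python
# hypothetical rule using string resources; requires snakemake >= 5.15.0
rule big_memory_step:
    input:
        "results/{sample}.bam"
    output:
        "results/{sample}.stats"
    resources:
        partition="bigmem",   # run this rule on a different partition
        qos="long",           # and with a different quality of service
        mem_mb=64000
    shell:
        "samtools stats {input} > {output}"
```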