snakemake / snakemake-executor-plugin-slurm

A Snakemake executor plugin for submitting jobs to a SLURM cluster
MIT License

Interpolation of wildcards in profile slurm configuration #11

Closed TBradley27 closed 5 months ago

TBradley27 commented 6 months ago

Hello,

The documentation for the slurm executor plugin explains that the 'slurm_extra' resource can be used to configure options not included in the standard resource definitions (https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/slurm.html#additional-custom-job-configuration).

It can be cumbersome to add this extra resource definition to each individual rule, so it is more convenient to specify the option in a profile. However, as far as I am aware, slurm_extra does not allow interpolation of snakemake placeholder values such as rule names or wildcards.
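For illustration, the kind of profile-wide default I have in mind looks roughly like this (the partition and the extra option are only placeholders):

```yaml
# my_profile/config.yaml -- illustrative sketch only
executor: slurm
default-resources:
  slurm_partition: "rocm"
  # a static value like this works, but placeholders such as {rule} or
  # {wildcards} are not interpolated here, which is the limitation described above
  slurm_extra: "'--mail-type=FAIL'"
```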

In snakemake v<8.0, this type of interpolation could be achieved within a configuration profile like so:

```yaml
cluster: "sbatch -A jblab -p rocm -t {resources.time_min} --mem={resources.mem_mb} -c {resources.cpus} -o results/logs_slurm/{rule}_{wildcards} -e results/logs_slurm/{rule}_{wildcards}"
```

This logging set-up was very useful: users could easily identify the rule-wildcard combinations that failed execution and troubleshoot the problem. The --cluster directive is no longer available in snakemake 8.0.

However, I don't think a similar type of interpolation is possible with the current slurm executor (using snakemake --executor slurm --profile my_profile), as the use of double quotes is not permitted and raises an error:

SLURM job submission failed. The error message was sbatch: error: Script arguments not permitted with --wrap option

Is there any way to replicate this functionality with respect to snakemake 8.0's API?

cmeesters commented 6 months ago

While resources are configurable per rule in a profile file (and I will try to make the docs clearer in this regard), options like -o results/logs_slurm/{rule}_{wildcards} are taken over by the executor plugin. At the time of writing the plugin, I had opted for putting something like the rule name into the job name and the log file name, but I was outvoted.
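For example, per-rule resources can be pinned in a profile roughly along these lines (rule name and values are just placeholders):

```yaml
# profile config.yaml -- minimal sketch of per-rule resource configuration
set-resources:
  my_rule:                          # hypothetical rule name
    mem_mb: 16000
    runtime: 120
    slurm_extra: "'--qos=normal'"
```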

The idea of gathering logs in a centralized place carries with it the idea of introducing a (configurable) deletion time at some point, to avoid accumulating too many files (which can be an issue on some systems). Again, I think putting more than just the job ID into the file name would be beneficial.

I will discuss this and post the conclusion here.

TBradley27 commented 6 months ago

Thank you very much for your comment

Is there any way for users to customise this themselves? I tried using the slurm_extra option to manipulate the -o and -e options to SLURM, but I didn't get very far, for the reasons stated above.
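For reference, what I attempted looked roughly like the following (paths are illustrative); it is this kind of value that runs into the --wrap error quoted above:

```yaml
# attempted profile excerpt -- illustrative only; this did not work for me
default-resources:
  slurm_extra: "'-o results/logs_slurm/{rule}_{wildcards}.out -e results/logs_slurm/{rule}_{wildcards}.err'"
```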

cmeesters commented 6 months ago

I think a few things need clarification:

Please tell us which features are missing or would be nice to have.

TBradley27 commented 5 months ago

Thank you for the clarification

I think part of the problem is that I was previously relying on the cluster submission wrapper in snakemake v7 and lower to handle most of my logging needs. I am now following the best practices suggested by snakemake --lint: more specifically, I declare log files explicitly with a log directive in each rule. As the linter suggests, this approach is preferable because it also generalises to local execution. I am also happy with the additional logging information contained within .snakemake/slurm_logs.
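For anyone reading this later, the pattern the linter nudges you towards looks roughly like this (rule, files, and command are just examples):

```python
# Snakefile excerpt -- example of an explicit per-rule log directive
rule align_reads:
    input:
        "data/{sample}.fastq.gz",
    output:
        "results/aligned/{sample}.bam",
    # the declared log also works for local execution, as the linter points out
    log:
        "results/logs/align_reads/{sample}.log",
    shell:
        "some_aligner {input} > {output} 2> {log}"
```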

Taking these steps has resolved the issue for me and I am happy for this issue to be closed.

cmeesters commented 5 months ago

Thank you. However, note that we discussed (#16) putting the rule name into the job's comment string, which can be queried with sacct as well.
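Depending on how accounting is set up on the cluster, the comment can then be listed with something along these lines:

```sh
# illustrative sacct call; the Comment field is only populated if the
# accounting storage is configured to keep job comments
sacct --format=JobID,JobName%30,Comment%40,State
```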

Looking forward to your contributions to the workflow collection!

cmeesters commented 5 months ago

closed