Open nikostr opened 7 months ago
Ouch. Thanks for the report!
Can you please attach your minimal example? And perhaps a log created with snakemake --verbose ..., too? That would be extremely helpful.
Sure!
workflow/Snakefile:
rule all:
    output:
        'results/a'
    shell:
        ''
workflow/profiles/default/config.yaml:
executor: slurm
jobs: 1
retries: 2
default-resources:
  slurm_account: <account>
  runtime: f"{2 + attempt}h"
  slurm_partition: core
and a slightly redacted version of the verbose log:
Using workflow specific profile workflow/profiles/default for setting default command line arguments.
Building DAG of jobs...
shared_storage_local_copies: True
remote_exec: False
SLURM run ID: d71a0ae6-210a-4886-b197-508e567eb099
Using shell: /usr/bin/bash
Provided remote nodes: 1
Job stats:
job      count
-----  -------
all          1
total        1
Resources before job selection: {'_cores': 9223372036854775807, '_nodes': 1}
Ready jobs (1)
Select jobs to execute...
Using greedy selector because only single job has to be scheduled.
Inferred runtime value of 180 minutes from 3h
Selected jobs (1)
Resources after job selection: {'_cores': 9223372036854775806, '_nodes': 0}
Execute 1 jobs...
[Fri Apr 5 11:29:29 2024]
rule all:
output: results/a
jobid: 0
reason: Missing output files: results/a
resources: mem_mb=1000, mem_mib=954, disk_mb=1000, disk_mib=954, tmpdir=<TBD>, slurm_account=$SLURM_ACCOUNT, runtime=180, slurm_partition=core
sbatch call: sbatch --job-name d71a0ae6-210a-4886-b197-508e567eb099 --output $DIR/snakemake-runtime-bug/.snakemake/slurm_logs/rule_all/%j.log --export=ALL --comment all -A $SLURM_ACCOUNT -p core -t 180 --mem 1000 --cpus-per-task=1 -D $DIR/snakemake-runtime-bug --wrap="$HOME/.conda/envs/snakemake/bin/python3.12 -m snakemake --snakefile $DIR/snakemake-runtime-bug/workflow/Snakefile --target-jobs all: --allowed-rules all --cores all --attempt 1 --force-use-threads --resources mem_mb=1000 mem_mib=954 disk_mb=1000 disk_mib=954 --wait-for-files $DIR/snakemake-runtime-bug/.snakemake/tmp._isktcvq --force --target-files-omit-workdir-adjustment --keep-storage-local-copies --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --verbose --rerun-triggers mtime software-env code params input --conda-frontend mamba --shared-fs-usage input-output storage-local-copies software-deployment source-cache persistence sources --wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/ --latency-wait 5 --scheduler ilp --local-storage-prefix .snakemake/storage --scheduler-solver-path $HOME/.conda/envs/snakemake/bin --default-resources 'mem_mb=min(max(2*input.size_mb, 1000), 8000)' 'disk_mb=max(2*input.size_mb, 1000)' tmpdir=system_tmpdir slurm_account=$SLURM_ACCOUNT 'runtime=f"{2 + attempt}h"' slurm_partition=core --executor slurm-jobstep --jobs 1 --mode remote"
unlocking
removing lock
removing lock
removed all locks
Full Traceback (most recent call last):
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake_executor_plugin_slurm/__init__.py", line 138, in run_job
out = subprocess.check_output(
^^^^^^^^^^^^^^^^^^^^^^^^
File "$HOME/.conda/envs/snakemake/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "$HOME/.conda/envs/snakemake/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'sbatch --job-name d71a0ae6-210a-4886-b197-508e567eb099 --output $DIR/snakemake-runtime-bug/.snakemake/slurm_logs/rule_all/%j.log --export=ALL --comment all -A $SLURM_ACCOUNT -p core -t 180 --mem 1000 --cpus-per-task=1 -D $DIR/snakemake-runtime-bug --wrap="$HOME/.conda/envs/snakemake/bin/python3.12 -m snakemake --snakefile $DIR/snakemake-runtime-bug/workflow/Snakefile --target-jobs all: --allowed-rules all --cores all --attempt 1 --force-use-threads --resources mem_mb=1000 mem_mib=954 disk_mb=1000 disk_mib=954 --wait-for-files $DIR/snakemake-runtime-bug/.snakemake/tmp._isktcvq --force --target-files-omit-workdir-adjustment --keep-storage-local-copies --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --verbose --rerun-triggers mtime software-env code params input --conda-frontend mamba --shared-fs-usage input-output storage-local-copies software-deployment source-cache persistence sources --wrapper-prefix https://github.com/snakemake/snakemake-wrappers/raw/ --latency-wait 5 --scheduler ilp --local-storage-prefix .snakemake/storage --scheduler-solver-path $HOME/.conda/envs/snakemake/bin --default-resources 'mem_mb=min(max(2*input.size_mb, 1000), 8000)' 'disk_mb=max(2*input.size_mb, 1000)' tmpdir=system_tmpdir slurm_account=$SLURM_ACCOUNT 'runtime=f"{2 + attempt}h"' slurm_partition=core --executor slurm-jobstep --jobs 1 --mode remote"' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake/cli.py", line 2052, in args_to_api
dag_api.execute_workflow(
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake/api.py", line 589, in execute_workflow
workflow.execute(
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake/workflow.py", line 1247, in execute
raise e
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake/workflow.py", line 1243, in execute
success = self.scheduler.schedule()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake/scheduler.py", line 306, in schedule
self.run(runjobs)
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake/scheduler.py", line 394, in run
executor.run_jobs(jobs)
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake_interface_executor_plugins/executors/base.py", line 72, in run_jobs
self.run_job(job)
File "$HOME/.conda/envs/snakemake/lib/python3.12/site-packages/snakemake_executor_plugin_slurm/__init__.py", line 142, in run_job
raise WorkflowError(
snakemake_interface_common.exceptions.WorkflowError: SLURM job submission failed. The error message was sbatch: error: Script arguments not permitted with --wrap option
WorkflowError:
SLURM job submission failed. The error message was sbatch: error: Script arguments not permitted with --wrap option
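For what it's worth, this looks like a nested-quoting failure rather than a runtime-parsing one: the submission presumably goes through a shell, the f-string's double quotes close the --wrap="..." argument early, and the rest of the expression gets word-split into extra arguments, which sbatch then rejects as script arguments. A minimal sketch of that tokenization, purely illustrative and not the executor's actual code:

import shlex

# Illustrative only: simulate how a POSIX shell tokenizes the sbatch
# call. The f-string's inner double quotes terminate the --wrap="..."
# value, so the remainder of the expression spills out as separate
# arguments, which is exactly what sbatch reports as forbidden script
# arguments.
call = (
    'sbatch --wrap="python -m snakemake '
    "--default-resources 'runtime=f\"{2 + attempt}h\"'\""
)
print(shlex.split(call))
# Output (wrapped for readability):
# ['sbatch',
#  "--wrap=python -m snakemake --default-resources 'runtime=f{2",
#  '+',
#  "attempt}h'"]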
I just tried replacing the runtime with str(2 + attempt) + "h" and it seems to work! Would this be the recommended way to do this? Would it make sense to add this to the documentation?
EDIT: I tried this again, and this time it protested. Additional verification needed.
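If I had to guess, the second attempt failed for the same underlying reason: str(2 + attempt) + "h" still contains double quotes, so once the expression is re-serialized into the nested sbatch --wrap string it runs into the same quoting problem. A value written without any quote characters, for example runtime: 60 * (2 + attempt) expressed in minutes, might avoid this, but I have not verified that.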
I've created a minimal workflow and set
runtime: f"{2 + attempt}h"
in my workflow profile. It is correctly parsed by Snakemake in the sense that it prints runtime=180 as part of the resources, but I get the error SLURM job submission failed. The error message was sbatch: error: Script arguments not permitted with --wrap option. I improvised the runtime specification since I couldn't find a documented way of doing it - is there a recommended/working way to specify dynamic runtimes in the profile?
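One possible workaround that sidesteps the profile quoting entirely would be to move the dynamic runtime into the Snakefile as a callable resource, since callables are evaluated before submission and the expression never has to survive the sbatch command line. A minimal sketch, reusing the rule from the reproducer above (whether this is the recommended approach is exactly the open question here):

rule all:
    output:
        'results/a'
    resources:
        # A callable resource receives the attempt number directly, so
        # no quoted f-string has to pass through the nested shell call.
        runtime=lambda wildcards, attempt: f"{2 + attempt}h",
    shell:
        ''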