jdblischak / smk-simple-slurm

A simple Snakemake profile for Slurm without --cluster-config
Creative Commons Zero v1.0 Universal

snakemake throws missing output files error when I use `--immediate-submit` option #24

Closed imsarath closed 3 months ago

imsarath commented 3 months ago

I used your SLURM profile, and it works well. However, when I try to submit jobs to SLURM using the --immediate-submit option, Snakemake submits the initial batch of jobs but then immediately tries to check the output files, even though the jobs are still running in the SLURM queue. As a result, it throws errors about missing output files.

I don't understand why this is happening. Could you provide some guidance on how to resolve this issue?

snakemake: v8.12.0

jdblischak commented 3 months ago

From the Snakemake docs:

Immediately submit all jobs to the cluster instead of waiting for present input files. This will fail, unless you make the cluster aware of job dependencies, e.g. via: $ snakemake --cluster 'sbatch --dependency {dependencies}'. Assuming that your submit script (here sbatch) outputs the generated job id to the first stdout line, {dependencies} will be filled with space separated job ids this job depends on. Does not work for workflows that contain checkpoint rules.

In other words, this won't work unless you manage the dependencies yourself via Slurm (example). Personally, that seems like a lot of extra work/complexity when Snakemake already handles this for you.
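As a rough illustration of what "managing the dependencies yourself" involves, the sketch below turns the space-separated job IDs that Snakemake substitutes for {dependencies} into a single sbatch flag. The function name build_dependency_flag is hypothetical and is not part of Snakemake or this profile. The afterok dependency type and the colon-separated ID list are assumptions taken from sbatch's documented afterok:job_id[:job_id...] form, so adjust them to your cluster's needs.

```shell
#!/bin/bash
# Hypothetical helper (not part of Snakemake or smk-simple-slurm):
# convert the job IDs passed in place of {dependencies} into one
# sbatch --dependency flag. Prints nothing when there are no IDs.
build_dependency_flag() {
  local ids=()
  local arg id parts
  for arg in "$@"; do
    # Split each argument on whitespace without triggering globbing
    read -ra parts <<< "$arg"
    for id in "${parts[@]}"; do
      # Keep only purely numeric tokens, i.e. things that look like job IDs
      if [[ "$id" =~ ^[0-9]+$ ]]; then
        ids+=("$id")
      fi
    done
  done

  # No upstream jobs: emit nothing so sbatch receives no extra flag
  if (( ${#ids[@]} == 0 )); then
    return 0
  fi

  # Join the IDs with ":" (assumed form: afterok:job_id[:job_id...])
  local IFS=":"
  printf -- '--dependency=afterok:%s' "${ids[*]}"
}
```

For example, build_dependency_flag 101 102 prints --dependency=afterok:101:102, while calling it with no arguments prints nothing, which is the case for jobs that have no upstream dependencies.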

What is your motivation for using --immediate-submit?

imsarath commented 3 months ago

In our research team, we developed our pipelines using this feature --immediate-submit in snakemake (6.2.1). Now we are trying to update the snakemake (>=8).

jdblischak commented 3 months ago

In our research team, we developed our pipelines using this feature --immediate-submit in snakemake (6.2.1)

Were you using this smk-simple-slurm profile for this previous --immediate-submit setup with Snakemake 6.2.1?

Now we are trying to update the snakemake (>=8).

Have you made all the various required updates to migrate to snakemake >8? The latest Snakemake 8 version of this profile and documentation was merged only recently on June 20th (b91a2284b1d2bbe8ec3f0bf2e157ab63c9024d13, #23)

And in general, a minimal reproducible example would be helpful.

imsarath commented 3 months ago

No, we were using the Snakemake Slurm profiles. We used the Python script from this repo.

Yes, I have made all the changes required to migrate to Snakemake >8.

Here, I used simple shell script to add dependencies to the jobs submission.

profile/slurm/config.yaml

executor: cluster-generic
cluster-generic-submit-cmd:
  mkdir -p logs/cluster/ &&
  sbatch
    --partition={resources.partition}
    --cpus-per-task={threads}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/cluster/smk.{rule}-{wildcards}-%j.out
    --parsable
    $(bash /path/to/parseJobID.sh {dependencies})
default-resources:
  - partition=core
  - mem_mb=4G
restart-times: 1
max-jobs-per-second: 10
max-status-checks-per-second: 1
local-cores: 1
latency-wait: 60
jobs: 500
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy
use-singularity: True
jobscript: slurm_jobscript.sh
use-conda: False
software-deployment-method: apptainer

/path/to/parseJobID.sh

#!/bin/bash
# helper script that parses slurm output for the job ID,
# and feeds it back to snakemake/slurm for dependencies.
# This is required when you want to use the snakemake --immediate-submit option

if [[ "Submitted batch job" =~ "$@" ]]; then
  echo -n ""
else
  deplist=$(grep -Eo '[0-9]{1,10}' <<< "$@" | tr '\n' ',' | sed 's/.$//')
  echo -n "--dependency=afterok:$deplist"
fi;

jdblischak commented 3 months ago

Here, I used simple shell script to add dependencies to the jobs submission.

@imsarath Very cool! Thanks for sharing. I had never tried this approach before.

Unfortunately I no longer have convenient access to an HPC cluster with Slurm (I no longer consult for the client I developed this profile for initially), so I can't actively troubleshoot this. @JoshLoecker do you have the bandwidth to try parseJobID.sh?

JoshLoecker commented 3 months ago

Hi @imsarath, I've never used snakemake like this and, like @jdblischak said, trying to manage dependencies this way will almost certainly be more trouble than it's worth. You're better off relying on snakemake to handle the input and output for you by removing $(bash /path/to/parseJobID.sh {dependencies}) from your config.yaml file. The mock Snakefile below allows snakemake to handle dependencies. Depending on how your workflow is set up, it may need to be changed to this format once the dependencies are removed from the configuration.

rule all:
    input: "output_file.txt"

rule a:
    input: "sample_data.csv"
    output: "rule_a_output.txt"
    shell: "touch {output}"

rule b:
    input: rules.a.output
    output: "output_file.txt"
    shell: "touch {output}"

I'll do my best to help no matter if you keep or remove the --immediate-submit option :)

Can you post the following items?

  1. An example Snakefile for your workflow
  2. The slurm_jobscript.sh script. I'm not sure if this is actually used to submit jobs since cluster-generic-submit-cmd is also defined, but I want to match your environment as closely as possible

imsarath commented 3 months ago

Hi @jdblischak @JoshLoecker, thanks for helping me. It looks like there's an issue with Snakemake, and I'm considering removing the --immediate-submit option from our pipeline.