harvardinformatics / snpArcher

Snakemake workflow for highly parallel variant calling designed for ease-of-use in non-model organisms.
MIT License

Memory allocation problem on a slurm system #148

Closed: Dictionary2b closed this issue 4 months ago

Dictionary2b commented 6 months ago

Hello, I was running the workflow for an extensive data set (over 800 samples) on a SLURM platform (UPPMAX). I used the GATK approach with intervals. I got an error message like this:

rule create_cov_bed:
    input: results/GCA_009792885.1/summary_stats/all_cov_sumstats.txt, results/GCA_009792885.1/callable_sites/all_samples.d4
    output: results/GCA_009792885.1/callable_sites/lark20231207_callable_sites_cov.bed
    jobid: 2558
    benchmark: benchmarks/GCA_009792885.1/covbed/lark20231207_benchmark.txt
    reason: Missing output files: results/GCA_009792885.1/callable_sites/lark20231207_callable_sites_cov.bed; Input files updated by another job: results/GCA_009792885.1/summary_stats/all_cov_sumstats.txt, results/GCA_009792885.1/callable_sites/all_samples.d4
    wildcards: refGenome=GCA_009792885.1, prefix=lark20231207
    resources: mem_mb=448200, mem_mib=427437, disk_mb=448200, disk_mib=427437, tmpdir=

sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available
Traceback (most recent call last):
  File "/crex/proj/uppstore2019097/nobackup/zongzhuang_larks_working/Final_variant_callinglarks/snpArcher/./profiles/slurm/slurm-submit.py", line 59, in <module>
    print(slurm_utils.submit_job(jobscript, *sbatch_options))
  File "/crex/proj/uppstore2019097/nobackup/zongzhuang_larks_working/Final_variant_callinglarks/snpArcher/profiles/slurm/slurm_utils.py", line 131, in submit_job
    raise e
  File "/crex/proj/uppstore2019097/nobackup/zongzhuang_larks_working/Final_variant_callinglarks/snpArcher/profiles/slurm/slurm_utils.py", line 129, in submit_job
    res = subprocess.check_output(["sbatch"] + optsbatch_options + [jobscript])
  File "/sw/apps/conda/latest/rackham/lib/python3.11/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/sw/apps/conda/latest/rackham/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sbatch', '--partition=core', '--time=7-00:00:00', '--ntasks=8', '--output=logs/slurm/slurm-%j.out', '--account=naiss2023-5-278', '--mem=448200', '/crex/proj/uppstore2019097/nobackup/zongzhuang_larks_working/Final_variant_callinglarks/snpArcher/.snakemake/tmp.fi5cgg1q/snakejob.create_cov_bed.2558.sh']' returned non-zero exit status 1.
Error submitting jobscript (exit code 1):

Select jobs to execute...

There is not enough memory on the SLURM system. I'm not sure where the issue is: whether I didn't request enough memory in the settings, or the HPC simply cannot provide that much memory. Should I change the source code to request a larger memory allocation for the workflow from the beginning (e.g. more than 1000 nodes)?

Besides, the whole workflow does not run that fast either: the bam2vcf jobs report about 2% progress every 24 hours, which will obviously exceed the time limit of the Snakemake SLURM job. Could you please give some suggestions on this? Thanks in advance.

Best, Zongzhuang

cademirch commented 6 months ago

Hi Zongzhuang, sorry for your issues with this. Could you provide your config, resource config, and the command line you used to run snparcher? The create_cov_bed step is pretty memory intensive, so that could be the issue with regard to your posted error log. As for slow progress, that could be due to a number of things outside of snpArcher's control, such as HPC queue and resource limits. However, please post those things above and we can try to diagnose.

Dictionary2b commented 6 months ago

@cademirch Thanks for your suggestion! Here are the config, resources, and the bash script I used to run snparcher (.sh). They are all archived in this zip file.

After asking UPPMAX support, I got a suggestion to request a fat-node partition job with at least 512 GB (-C 512 GB) instead of using the core partition as I did, which offers at most 128 GB. Is this something I can change in the cluster-config file? Besides, since there are only a few fat nodes on the cluster, is it possible to have the workflow submit only the memory-intensive jobs to the node partition and keep the others with the previous setting? I'm also unsure whether I need to kill the current process and restart it to apply the changes.

Zongzhuang_config.zip

cademirch commented 5 months ago

Hi Zongzhuang,

I took a look and your configs look OK to me. One thing I will suggest is using the --slurm option when executing Snakemake instead of the profile. See these docs for more details: https://snakemake.readthedocs.io/en/v7.32.3/executing/cluster.html#executing-on-slurm-clusters

As for submitting certain rules to specific partitions, this is possible; the docs above detail how. I would suggest creating a YAML profile in which you define which rules go to which partition. I'll provide an example here:

uppmax_example_profile.yaml

slurm: True # same as `--slurm` on the command line
jobs: 1000 # set number of jobs to run concurrently
use-conda: True
# other desired command line options can be set here as well
default-resources:
  slurm_partition: <Your partition name here> # This will be the default partition for all rules
  slurm_account: <Your slurm account> # If applicable on your cluster
set-resources:
  create_cov_bed:
    slurm_partition: <Your big partition name here> # This will override the default partition for this rule
  # ... you can specify partitions for other rules by following this pattern

Then when you run snparcher you can do so with this profile. Let me know if this helps!
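For example, a usage sketch assuming the YAML above is saved as my_uppmax_profile/config.yaml (the directory name here is just a placeholder):

snakemake --snakefile workflow/Snakefile --profile ./my_uppmax_profile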

Dictionary2b commented 5 months ago

Hi Cade,

Many thanks for your explanation!

I'm unsure whether using the --slurm option without the --profile option can work in this case. The cpus-per-task issue on the SLURM system still seems to persist; it is now partially solved by an edit in profiles/slurm/slurm_utils.py.

If I do still have to use the --profile option, could I modify profiles/slurm/config.yaml (or something in cluster_config.yml?) to submit certain rules to specific partitions?

cademirch commented 5 months ago

Okay, sorry, I didn't realize that issue as well. The shell script you sent above looks like this:

❯ cat run_pipeline_zongz1123.sh
#!/bin/bash
#SBATCH -A naiss2023-5-278
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 10-00:00:00
#SBATCH -J snpArcher
#SBATCH -e snpArcher_%A_%a.err # File to which STDERR will be written
#SBATCH -o snpArcher_%A_%a.out
#SBATCH --mail-type=all
#SBATCH --mail-user=dictionary2b@gmail.com
module load conda/latest

CONDA_BASE=$(conda info --base)
source $CONDA_BASE/etc/profile.d/conda.sh
mamba activate snparcher
snakemake --snakefile workflow/Snakefile --profile ./profiles/slurm

You would edit the file .profiles/slurm to include this:

default-resources:
  slurm_partition: <Your partition name here> # This will be the default partition for all rules
  slurm_account: <Your slurm account> # If applicable on your cluster
set-resources:
  create_cov_bed:
    slurm_partition: <Your big partition name here> # This will override the default partition for this rule
  # ... you can specify partitions for other rules by following this pattern

Let me know if this makes sense and is helpful!

Dictionary2b commented 5 months ago

Thanks, Cade. ./profiles/slurm is a directory containing both config.yaml and cluster_config.yml, so I can't find the right place to include the code directly as you suggested. To my understanding, if I want to add a specific resource setting for certain jobs, I would need to add it to cluster_config.yml to make it look like this:

__default__:
    partition: "snowy"
    time: 7-00:00:00
    partition: core
    ntasks: 8
    output: "logs/slurm/slurm-%j.out"
    account: naiss2023-5-278

create_cov_bed:
    partition: "snowy"
    time: 7-00:00:00
    partition: node
    nodes: 1
    ntasks: 8
    constraint: mem512GB
    output: "logs/slurm/slurm-%j.out"
    account: naiss2023-5-278

Do I understand you correctly? Sorry for the misunderstanding!

cademirch commented 5 months ago

Ah, my apologies, I messed up what I posted. I believe what you have is correct. It's a bit confusing between the two main ways to run SLURM with Snakemake. I will also look more into the issue you posted above, now that I have a SLURM cluster to test on.

brian-arnold commented 4 months ago

Hello! Just to follow up on this discussion, how is memory getting determined for this rule? Is it modifiable and capable of being run with lower memory? Looking at the create_cov_bed rule, I don't see a resources section.

The discussion above could be a potential solution, but when we run this rule it tries to request a huge amount of memory (1,700 GB) that may not exist on any node of our computing cluster (see the error message below: "Requested node configuration is not available").

If it's useful information, we're using low-depth human samples (~325) mapped to the hg38 genome, which is quite complete.

Sincerely, Brian

[Wed Feb 7 13:22:58 2024]
rule create_cov_bed:
    input: results/hg38/summary_stats/all_cov_sumstats.txt, results/hg38/callable_sites/all_samples.d4
    output: results/hg38/callable_sites/past_and_turk_callable_sites_cov.bed
    jobid: 1634
    benchmark: benchmarks/hg38/covbed/past_and_turk_benchmark.txt
    reason: Missing output files: results/hg38/callable_sites/past_and_turk_callable_sites_cov.bed
    wildcards: refGenome=hg38, prefix=past_and_turk
    resources: mem_mb=1743686, mem_mib=1662909, disk_mb=1743686, disk_mib=1662909, tmpdir=

sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available
Traceback (most recent call last):
  File "/Genomics/ayroleslab2/emma/snpArcher/profiles/slurm/slurm-submit.py", line 59, in <module>
    print(slurm_utils.submit_job(jobscript, *sbatch_options))
  File "/Genomics/ayroleslab2/emma/snpArcher/profiles/slurm/slurm_utils.py", line 131, in submit_job
    raise e
  File "/Genomics/ayroleslab2/emma/snpArcher/profiles/slurm/slurm_utils.py", line 129, in submit_job
    res = subprocess.check_output(["sbatch"] + optsbatch_options + [jobscript])
  File "/Genomics/argo/users/emmarg/.conda/envs/snparcher/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/Genomics/argo/users/emmarg/.conda/envs/snparcher/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sbatch', '--time=9000', '--nodes=1', '--mem=1743686', '--output=logs/slurm/slurm-%j.out', '--cpus-per-task=1', '/Genomics/ayroleslab2/emma/snpArcher/past_and_turk/.snakemake/tmp.lh6xpm8u/snakejob.create_cov_bed.1634.sh']' returned non-zero exit status 1.
Error submitting jobscript (exit code 1):

tsackton commented 4 months ago

We have seen this a number of times - the default memory specification seems to go off the rails for a reason we don't yet understand.

One solution is to just set mem_mb to some other reasonable number directly in the Snakemake rule, in the resources section.
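For illustration, a minimal sketch of what that edit could look like (the 64000 value, the named inputs, and the trimmed-down rule body are placeholders, not snpArcher's actual rule definition):

rule create_cov_bed:
    input:
        stats = "results/{refGenome}/summary_stats/all_cov_sumstats.txt",
        d4 = "results/{refGenome}/callable_sites/all_samples.d4"
    output:
        bed = "results/{refGenome}/callable_sites/{prefix}_callable_sites_cov.bed"
    resources:
        mem_mb = 64000  # fixed request instead of the computed default; pick what your nodes can satisfy
    shell:
        "..."  # keep snpArcher's actual command here unchanged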

We are hoping to debug this but so far haven't tracked down the problem.

cademirch commented 4 months ago

I think this may be happening because create_cov_bed is not defined in the resources YAML, so Snakemake falls back to its default: https://github.com/snakemake/snakemake/blob/0998cc57cbd02c38d1a3bbf1662c8b23b7601e20/snakemake/resources.py#L11-L16
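If editing the workflow itself is undesirable, the same override can likely be applied from outside it via Snakemake's set-resources mechanism, following the same pattern as the profile example above (the 64000 value is an arbitrary placeholder):

set-resources:
  create_cov_bed:
    mem_mb: 64000

The command-line form, --set-resources create_cov_bed:mem_mb=64000, should be equivalent.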

Erythroxylum commented 4 months ago

Hello, this issue has caused failures in the qc module in 2 of 3 otherwise successful runs. I have defined

resources:
    mem_mb = 16000

in the workflow/modules/qc Snakefile, before the 'run' or 'shell' directive in every rule, but the error and job failure persist. The Snakefile and err file are attached.

As you say, Cade, the first error is for create_cov_bed, which is not a rule in this Snakefile. Attached: err336.txt, Snakefile.txt

Dictionary2b commented 4 months ago

> Ah, my apologies, I messed up what I posted. I believe what you have is correct. It's a bit confusing between the two main ways to run SLURM with Snakemake. I will also look more into the issue you posted above, now that I have a SLURM cluster to test on.

Thanks, Cade. The workflow has now finished properly. Defining the specific resource allocation for each job in cluster_config.yml, as I did above, was the solution. : )