Hi, I'm evaluating RUFUS to potentially incorporate into our exome and genome variant calling pipelines. We recently got it installed on the HPC (sounds like with your help, thanks much!). I was taking it for a test drive today, and I got a test run to work fine, producing the single de novo variant it is supposed to after a pretty quick runtime (the test script is included further down).
However, when I try to run a very similar script, but substituting some of our genomes in, I run into a memory issue (tail of slurm error report):
"~~~~ printing out paramater values used in script ~~~~
value of ProbandGenerator LUR_005_01_noalt_hg38_sort.bam.generator
Value of ParentGenerators:
LUR_005_02_noalt_hg38_sort.bam.generator
LUR_005_03_noalt_hg38_sort.bam.generator
Value of K is: 25
Value of Threads is: 24
value of ref is: /projects/b1073/pipelines/commonref/GRCh38/GRCh38_NoContigs.primary_assembly.genome.fa
value of min is:
Did not provide refHash
$_arg_min is empty
_arg_min is
MutantMinCov is
parent is LUR_005_02_noalt_hg38_sort.bam.generator
parent is LUR_005_03_noalt_hg38_sort.bam.generator
Running jellyfish for LUR_005_02_noalt_hg38_sort.bam.generator
Running jellyfish for LUR_005_01_noalt_hg38_sort.bam.generator
Running jellyfish for LUR_005_03_noalt_hg38_sort.bam.generator
slurmstepd: error: Job 475163 exceeded memory limit (103113338880 > 75161927680)"
I used Slurm to request 70 GB of memory (--mem=70gb). I figured that would be enough for one trio on hg38, but maybe not? Any advice? Thanks in advance!
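For scale, the two byte counts in the slurmstepd line above can be converted to GiB. This is only a quick sketch of the arithmetic; `to_gib` is an illustrative helper name, not part of RUFUS or Slurm, and the two numbers are copied from the error message.

```shell
#!/bin/bash
# Convert a byte count to GiB (1 GiB = 1024^3 bytes), rounded to one decimal place.
# to_gib is a hypothetical helper written for this post.
to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 * 1024 * 1024) }'
}

used=103113338880    # bytes in use when the job hit the limit (from the log)
limit=75161927680    # memory limit enforced on the job, in bytes (from the log)

echo "used:  $(to_gib "$used") GiB"
echo "limit: $(to_gib "$limit") GiB"
```

The limit works out to exactly the 70 GiB requested, while the job had grown to roughly 96 GiB at the point the error was reported.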
Take care,
Jeff (Postdoc in the Carvill lab at Northwestern)
PS - Here is full script:
"#!/bin/bash
#SBATCH -A b1042
#SBATCH -p genomics
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -t 40:00:00
#SBATCH --mem=70gb
# rufus_test.sh
# running on trio_LUR005 for a first test
iFolder="/projects/b1073/WGS_peds_epilepsies/trio_LUR005/bams"
oFolder="/projects/b1073/WGS_peds_epilepsies/jdc_sandbox/rufus"
genomeFASTA="/projects/b1073/pipelines/commonref/GRCh38/GRCh38_NoContigs.primary_assembly.genome.fa"
module purge all
module load rufus/latest
module load samtools/1.6
# bash runRufus.sh --subject Child.bam --controls Mother.bam --controls Father.bam --kmersize 25 --threads 40 --ref human_reference_v37_decoys.fa
# hmm specifying output directory isn't listed ??? does it print to stdout? do I need > file.txt ?
sh /software/RUFUS/runRufus.sh --subject ${iFolder}/LUR_005_01_noalt_hg38_sort.bam --controls ${iFolder}/LUR_005_02_noalt_hg38_sort.bam --controls ${iFolder}/LUR_005_03_noalt_hg38_sort.bam --kmersize 25 --threads 24 --ref $genomeFASTA
sh /software/RUFUS/runRufus.sh --subject ${iFolder}/LUR_005_01_noalt_hg38_sort.bam --controls ${iFolder}/LUR_005_02_noalt_hg38_sort.bam --controls ${iFolder}/LUR_005_03_noalt_hg38_sort.bam --kmersize 25 --threads 24 --ref $genomeFASTA > ${oFolder}/file.txt
exit"
PPS - The script didn't crash per se, but appears to have stalled. The Jelly.chr files are slowly being updated as it runs. The generator Jhash.temp and fq files are empty.
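One way to tell a stall from slow progress is to check which output files have been modified recently and which are still empty. The following is a minimal sketch, assuming a `find` that supports `-maxdepth`, `-mmin`, and `-empty` (GNU and BSD `find` both do); the function names are hypothetical and nothing here is part of RUFUS.

```shell
#!/bin/bash
# Hypothetical helpers (not part of RUFUS) for checking whether a run is still progressing.

# Names of regular files in directory $1 modified within the last $2 minutes (default 10).
recently_modified() {
    find "$1" -maxdepth 1 -type f -mmin "-${2:-10}" -exec basename {} \; | sort
}

# Names of regular files in directory $1 that are still zero bytes.
still_empty() {
    find "$1" -maxdepth 1 -type f -empty -exec basename {} \; | sort
}

# Example: point it at the directory the run is writing to (defaults to the current one).
work_dir="${1:-.}"
echo "modified in the last 10 minutes:"
recently_modified "$work_dir" 10
echo "still empty:"
still_empty "$work_dir"
```

If the first list keeps changing between checks, the job is still working even while other outputs remain empty.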
It wasn't actually stalled; after a couple of hours, we got de novo variants. Closing issue. This tool looks like it is going to be super useful, thanks for building it and making it available!
For reference, here is the test script that ran successfully:
"#!/bin/bash
#SBATCH -A b1042
#SBATCH -p genomics
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -t 40:00:00
#SBATCH --mem=70gb
# rufus_test2.sh
# running on trio_LUR005 for a first test
iFolder="/software/RUFUS/testRun/"
oFolder="/projects/b1073/WGS_peds_epilepsies/jdc_sandbox/rufus"
genomeFASTA="/projects/b1073/pipelines/commonref/GRCh38/GRCh38_NoContigs.primary_assembly.genome.fa"
module purge all
module load rufus/latest
module load samtools/1.6
# bash runRufus.sh --subject Child.bam --controls Mother.bam --controls Father.bam --kmersize 25 --threads 40 --ref human_reference_v37_decoys.fa
# hmm specifying output directory isn't listed ??? does it print to stdout? do I need > file.txt ?
sh /software/RUFUS/runRufus.sh -s ${iFolder}/Child.bam -c ${iFolder}/Mother.bam -c ${iFolder}/Father.bam -k 25 -t 24 -m 8 -r /software/RUFUS/resources/references/small_test_human_reference_v37_decoys.fa
sh /software/RUFUS/runRufus.sh -s ${iFolder}/Child.bam -c ${iFolder}/Mother.bam -c ${iFolder}/Father.bam -k 25 -t 24 -m 8 -r /software/RUFUS/resources/references/small_test_human_reference_v37_decoys.fa > ${oFolder}/file2.txt
exit"