kevlar-dev / kevlar

Reference-free variant discovery in large eukaryotic genomes
https://kevlar.readthedocs.io
MIT License

kevlar memory error #378

Open sph17 opened 4 years ago

sph17 commented 4 years ago

I am working through the kevlar tutorial workflow, and at the kevlar partition step some files fail with a memory error. I requested 64G of memory and 64G of swap.

[kevlar::partition] Building read graph in relaxed mode
Traceback (most recent call last):
  File "/PHShome/sph35/.conda/envs/kevlarClone/bin/kevlar", line 8, in <module>
  File "/PHShome/sph35/.conda/envs/kevlarClone/lib/python3.7/site-packages/kevlar/__main__.py", line 31, in main
  File "/PHShome/sph35/.conda/envs/kevlarClone/lib/python3.7/site-packages/kevlar/partition.py", line 68, in main
  File "/PHShome/sph35/.conda/envs/kevlarClone/lib/python3.7/site-packages/kevlar/partition.py", line 33, in partition
  File "/PHShome/sph35/.conda/envs/kevlarClone/lib/python3.7/site-packages/kevlar/readgraph.py", line 125, in populate_edges
  File "/PHShome/sph35/.conda/envs/kevlarClone/lib/python3.7/site-packages/networkx/classes/graph.py", line 920, in add_edge

standage commented 4 years ago

Sorry for the delayed response. Can you show the precise commands you ran (starting with kevlar count) that led to this error message? How much memory does your computer have?

sph17 commented 4 years ago

Here are the two scripts I ran (kevlar novel, then kevlar partition):

#!/bin/bash
#BSUB -J Kevlar_ArrayJob[6]
#BSUB -e /err_files/targz.err
#BSUB -q big
#BSUB -n 1
#BSUB mem=16G
#BSUB swp=16G
#BSUB -sla miket_sc

SOURCE_FILE_1="/kevlar/files_counttable.txt"
SOURCE_FILE_2="/kevlar/files_fastq.txt"

INPUT_1=$(sed -n ''${LSB_JOBINDEX}'p' ${SOURCE_FILE_1})
INPUT_2=$(sed -n ''${LSB_JOBINDEX}'p' ${SOURCE_FILE_2})

filename=$(basename -- "$INPUT_2")
extension="${filename##*.}"
filename="${filename%.fastq}"

OUTPUT=${filename}_novel
source activate kevlar
N_THREADS=$(nproc)
kevlar novel --out ${OUTPUT}_reads.fastq \
    --control-counts /data/talkowski/sphao/projects/kevlar/YM003-C.counttable \
    --case-counts ${INPUT_1} --case ${INPUT_2}

#!/bin/bash
#BSUB -J Kevlar_ArrayJob[6]
#BSUB -o /log_files/kevlarArrayJob04.log
#BSUB -e /files/err_files/targz04.err
#BSUB -q big
#BSUB -n 1
#BSUB mem=64G
#BSUB swp=64G
#BSUB -sla miket_sc

SOURCE_FILE_1="/kevlar/files_novel_reads_filtered.txt"
INPUT_1=$(sed -n ''${LSB_JOBINDEX}'p' ${SOURCE_FILE_1})

filename=$(basename -- "$INPUT_1")
extension="${filename##*.}"
filename="${filename%_novelReadsFiltered}"

OUTPUT=${filename}_partition
source activate kevlar
kevlar partition -o ${OUTPUT} ${INPUT_1}

I first ran partition with 16G as well, then tried 64G.
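
For context, the traceback ends inside populate_edges, where edges are being added to the read graph. One possible reason a step like this runs out of memory, offered as an assumption and not something confirmed in this thread, is that when many reads share the same k-mer, the number of read pairs grows quadratically: n reads sharing one k-mer give n*(n-1)/2 edges. The sketch below is a toy illustration in plain Python, not kevlar's actual code; the function name count_shared_kmer_edges and the input dictionary are hypothetical.

```python
from itertools import combinations


def count_shared_kmer_edges(kmer_to_reads):
    """Count distinct read pairs that share at least one k-mer.

    kmer_to_reads is a hypothetical mapping from a k-mer string to the
    names of the reads that contain it. This is an illustration only and
    does not reproduce how kevlar builds its read graph.
    """
    edges = set()
    for reads in kmer_to_reads.values():
        # Every pair of reads sharing this k-mer becomes one edge.
        # Sorting gives each pair a canonical order, so (a, b) and
        # (b, a) are not counted twice.
        for a, b in combinations(sorted(set(reads)), 2):
            edges.add((a, b))
    return len(edges)


if __name__ == "__main__":
    # Toy input: a single k-mer present in n reads.
    for n in (10, 100, 1000):
        toy = {"ACGT": [f"read{i}" for i in range(n)]}
        print(n, "reads ->", count_shared_kmer_edges(toy), "edges")
```

If something like this is what is happening, a few very abundant k-mers in the input could dominate memory use regardless of how much is requested, so it may be worth checking how many reads the failing files contain compared with the files that succeed.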