isovic / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads. http://genome.cshlp.org/content/early/2017/01/18/gr.214270.116 Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/racon
MIT License

terminate called after throwing an instance of 'std::bad_alloc' #122

Open Timothy-Amos opened 5 years ago

Timothy-Amos commented 5 years ago

Hi, I got a bad_alloc error:

[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ module add racon/1.3.2
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ OUT_DIR=/srv/scratch/babsgenome/BABSGenome-Jun17/assemblies/2019-05-16.TigerWTDBG2Medaka
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ READS=/srv/scratch/babsgenome/BABSGenome-Jun17/data/2019-01-22.Nanopore/Tiger/LXBAB153514.fastq
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ ASSEMBLY=/srv/scratch/babsgenome/BABSGenome-Jun17/assemblies/2019-01-24.TigerWTDBG2/tiger.wtdbg2v1.fasta
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ DATE=2019-05-22
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ PREFIX=tiger_wtdbg2.$DATE
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ SAM=/srv/scratch/babsgenome/BABSGenome-Jun17/assemblies/2019-01-24.TigerWTDBG2/tiger.wtdbg2v1.sam
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ PPN=14
[z3101118@kc11b14 2019-05-16.TigerWTDBG2Medaka]$ racon -t $PPN $READS $SAM $ASSEMBLY -m 8 -x -6 -g -8 -w 500 > ${PREFIX}.racon.fasta
[racon::Polisher::initialize] loaded target sequences
[racon::Polisher::initialize] loaded sequences
[racon::Polisher::initialize] loaded overlaps
[racon::Polisher::initialize] aligned overlap 4298734/4298734
[racon::Polisher::initialize] transformed data into windows
[racon::Window::generate_consensus] warning: contig 220 might be chimeric in window 542!
terminate called after throwing an instance of 'std::bad_alloc'
 what():  std::bad_alloc
Aborted (core dumped)

I don't think it ran out of memory, since I had 1 TB of memory available and the inputs are only ~100 GB each:

[z3101118@katana1 2019-05-16.TigerWTDBG2Medaka]$ ls -lh /srv/scratch/babsgenome/BABSGenome-Jun17/assemblies/2019-01-24.TigerWTDBG2/tiger.wtdbg2v1.sam
-rw-rw-r--. 1 z3452659 babsgenome 122G Jan 28 17:37 /srv/scratch/babsgenome/BABSGenome-Jun17/assemblies/2019-01-24.TigerWTDBG2/tiger.wtdbg2v1.sam
[z3101118@katana1 2019-05-16.TigerWTDBG2Medaka]$ ls -lh /srv/scratch/babsgenome/BABSGenome-Jun17/data/2019-01-22.Nanopore/Tiger/LXBAB153514.fastq
-rw-rw-r--. 1 z3452659 babsgenome 97G Jan 21 22:52 /srv/scratch/babsgenome/BABSGenome-Jun17/data/2019-01-22.Nanopore/Tiger/LXBAB153514.fastq

I see that in similar reports some people were told to rename their reads, or that reads with very long CIGAR strings could be the problem. I haven't investigated either potential issue.
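
Both possibilities can be checked quickly before rerunning. The following is only a rough sketch: the function names `max_cigar_ops` and `duplicate_read_names` are made up for illustration, the FASTQ is assumed to use four lines per record, and no particular CIGAR length is known to be the point at which racon fails, so the number is just something to report back.

```shell
# Hypothetical helpers, not part of racon or samtools.

# Print the largest number of CIGAR operations found in any SAM record.
# Header lines (starting with '@') are skipped; column 6 is the CIGAR string.
max_cigar_ops() {
    awk -F'\t' '!/^@/ {
        n = gsub(/[MIDNSHP=X]/, "", $6)   # gsub returns the number of operation letters
        if (n > max) max = n
    } END { print max + 0 }' "$1"
}

# Print read names that occur more than once in a FASTQ file.
# Assumes four lines per record; only the first word of each header is compared.
# Sorting a ~100 GB file needs a lot of temporary disk space and time.
duplicate_read_names() {
    awk 'NR % 4 == 1 { print substr($1, 2) }' "$1" | sort | uniq -d
}
```

With the variables defined above, these would be called as `max_cigar_ops "$SAM"` and `duplicate_read_names "$READS" | head`. An empty result from the second command means the read names are already unique.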

Any thoughts?

rvaser commented 5 years ago

Hello, the error might also indicate that an invalid value was passed to some memory allocation function. I don't know what caused it, since file parsing went well. As a first step, try mapping your reads with minimap2 without alignment and pass the resulting PAF to racon. If that does not work, could you share your data so I can investigate locally? If that is not possible, would you help me investigate by modifying some parts of the code so we can locate the culprit?

Best regards, Robert
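
The suggestion to map without alignment and pass the PAF to racon could look roughly like the sketch below. The function name `polish_with_paf` and its argument order are hypothetical, and the `-x map-ont` preset is an assumption based on the reads being Nanopore data. When run without `-a` or `-c`, minimap2 writes PAF and skips base-level alignment, so the output carries no CIGAR strings.

```shell
# Hypothetical wrapper; the name and argument order are illustrative only.
# Usage: polish_with_paf <reads> <assembly> <threads> <output prefix>
polish_with_paf() {
    local reads=$1 assembly=$2 threads=$3 prefix=$4

    # Map reads to the draft assembly; no -a/-c, so only approximate
    # mappings are written (PAF), without base-level alignment.
    # -x map-ont is an assumption for Nanopore reads.
    minimap2 -x map-ont -t "$threads" "$assembly" "$reads" > "${prefix}.paf" || return 1

    # Same racon options as in the original command, with the PAF
    # passed in place of the SAM file.
    racon -t "$threads" -m 8 -x -6 -g -8 -w 500 \
        "$reads" "${prefix}.paf" "$assembly" > "${prefix}.racon.fasta"
}
```

With the variables from the original report, this would be called as `polish_with_paf "$READS" "$ASSEMBLY" "$PPN" "$PREFIX"`.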