vpc-ccg / haslr

A fast tool for hybrid genome assembly of long and short reads
GNU General Public License v3.0

Error - std::bad_alloc during "calling consensus sequence between anchors" #5

Open jpummil opened 4 years ago

jpummil commented 4 years ago

[NOTE] calling consensus sequence between anchors...
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Compute resources seem to be ample from a memory perspective: the node has 768 GB of RAM. Monitoring with top, the task seemed to be around 330 GB when the error occurred.

Input data:

(Haslr) pinnacle-l4:jpummil:/scrfs/storage/jpummil/C.vittatus$ ls -lh *.fastq
-rw-r--r--. 1 jpummil jpummil 13G Feb  6 11:07 NU2WGS_R1.fastq
-rw-r--r--. 1 jpummil jpummil 13G Feb  6 11:09 NU2WGS_R2.fastq
-rw-r--r--. 1 jpummil jpummil 31G Feb  6 11:16 Q1133andQ1171.fastq

Input command:

haslr.py -t 24 -g 700m -o RUN1 -l Q1133andQ1171.fastq -x pacbio -s NU2WGS_R1.fastq NU2WGS_R2.fastq

haghshenas commented 4 years ago

Hi Jeff,

Sorry that I was busy with other stuff and got back to this so late. Is the issue fixed? If yes, could you please let me know what the problem was and how you fixed it?

jstrickland63 commented 4 years ago

Hello, I also got this same error. I ran it on a node with 80 CPUs and 1500 GB of RAM, and I was able to run the sample dataset with no problems. The output from the .err file is below. Please let me know what other information you might need. I will also start tinkering with parameters, as jelber2 did in #11, and see if I get lucky.

[NOTE] number of threads: 80

[NOTE] loading contig sequences...
processing file: /scratch2/jlstrck/Snake_HASLR_8March2020/sr_k49_a3.contigs.nooverlap.fa... Done in 7.63 CPU seconds (7.63 real seconds)
loaded 16708266 contigs
elapsed time 8.25 CPU seconds (10.37 real seconds)

[NOTE] calculating kmer frequency of unique contigs
mean: 67.42
elapsed time 9.85 CPU seconds (11.96 real seconds)

[NOTE] loading long read sequences...
processing file: /scratch2/jlstrck/Snake_HASLR_8March2020/lr25x.fasta... Done in 70.21 CPU seconds (70.23 real seconds)
loaded 4213353 long reads
elapsed time 80.08 CPU seconds (82.21 real seconds)

[NOTE] loading alignment between contigs and long reads...
processing file: /scratch2/jlstrck/Snake_HASLR_8March2020/map_contigs_k49_a3_lr25x.paf... Done in 103.05 CPU seconds (103.11 real seconds)
loaded 9169927 alignments
elapsed time 192.75 CPU seconds (218.77 real seconds)

[NOTE] fixing overlapping alignments... elapsed time 225.54 CPU seconds (251.56 real seconds)

[NOTE] building compact long reads... elapsed time 229.36 CPU seconds (255.92 real seconds)

[NOTE] building the backbone graph... elapsed time 239.56 CPU seconds (268.25 real seconds)

[NOTE] cleaning weak edges... removed 192666 edges elapsed time 247.01 CPU seconds (277.75 real seconds)

[NOTE] cleaning tips... removed 1324 tips elapsed time 254.49 CPU seconds (287.72 real seconds)

[NOTE] cleaning simple bubbles... removed 11407 simple bubbles elapsed time 261.54 CPU seconds (296.75 real seconds)

[NOTE] cleaning super bubbles... removed 265 super bubbles elapsed time 268.81 CPU seconds (306.35 real seconds)

[NOTE] cleaning small bubbles... removed 24 small bubbles elapsed time 275.91 CPU seconds (316.15 real seconds)

[NOTE] calculating long read coordinates between anchors... elapsed time 6082.05 CPU seconds (394.16 real seconds)

[NOTE] calling consensus sequence between anchors...
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

haghshenas commented 4 years ago

I think I was able to reproduce this bug. I'm working on it and will update you.

palfalvi commented 4 years ago

Hi @haghshenas !

I also recently got the same error, on 35 CPUs and 2800 GB of RAM, trying to assemble a 2.3 Gbp genome with Illumina and ONT data. I would just like to know if there is any solution or workaround. It would be great to test this assembler!

CIWa commented 4 years ago

Hi,

Thank you very much for offering this great tool! To support debugging the issue: I am using a single node with 2 TB of RAM, for a 6 Gbp genome, with PacBio CLR + Illumina PE data. RAM usage was fine before the crash (< 60 %). My version is 0.8a1, built successfully from source.

My call: haslr.py -t 126 -o ./ -g 6g -l pacbio.fa -x pacbio -s R1.fastq R2.fastq

The error messages are:

[29-Jun-2020 10:07:29] assembling long reads using HASLR... failed
ERROR: "haslr_assemble" returned non-zero exit status

And in asm_contigs_k49_a3_c250_lr25x_b500_s3_sim0.85.err:

[NOTE] calling consensus sequence between anchors...
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Best wishes, Isabel

drs commented 3 years ago

Hi @haghshenas,

I also got the same error trying to assemble a small genome (32.5 Mb, Thalassiosira pseudonana) on two different systems (Ubuntu 20.04, 64 GB RAM, 40 threads; Debian 9, 400 GB RAM, 40 threads).

I used different subsamples (15X to 100X coverage) of a large dataset (SRA SRR7762361) to assemble the genome, and the only dataset that fails with this error is the one with 15X coverage.

The command I used is:

$ haslr.py -o Assembly -g 12.2m -x nanopore -t 40 \
-s Illumina-R1.fastq Illumina-R2.fastq Illumina-Singles.fastq -l Nanopore-15X.fastq

I have no clue why this specific dataset failed, or whether it is really related to this issue, since the problem seems to affect large, gigabase-sized genomes. I hope this additional information is of some value to you.
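For reference, a minimal Python sketch of how a seeded, fixed-fraction FASTQ subsample can be drawn (the file names and the helper function are only illustrative; in practice a tool such as seqtk sample does the same job much faster):

```python
import random

def subsample_fastq(in_path, out_path, fraction, seed=42):
    """Keep each 4-line FASTQ record with probability `fraction` (seeded)."""
    rng = random.Random(seed)
    with open(in_path) as fin, open(out_path, "w") as fout:
        while True:
            record = [fin.readline() for _ in range(4)]
            if not record[0]:
                break  # end of file
            if rng.random() < fraction:
                fout.writelines(record)

# e.g. subsample_fastq("Nanopore.fastq", "Nanopore-15X.fastq", 15 / 100)
```

Re-drawing the 15X subsample with a different seed could help distinguish a general low-coverage problem from a quirk of one particular subsample.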

Regards, Samuel Drouin

koujiaodahan commented 3 years ago

So, what's the solution to "terminate called after throwing an instance of 'std::bad_alloc'"

jessiepelosi commented 2 years ago

I've also run into this issue, with the same error during the "calling consensus sequence between anchors" step:

terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc

Has there been any progress on solving this issue since 2020?

Looking at this issue from STAR, it's likely not an issue with available memory but rather the number of headers (sequences) in the initial short-read assembly. Any advice/fixes?
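As a quick sanity check on that hypothesis, a minimal Python sketch for counting headers in the short-read assembly fed to haslr_assemble (the file name in the comment is taken from the log earlier in this thread and is only illustrative):

```python
def count_fasta_headers(path):
    """Count '>' header lines, i.e. the number of sequences in a FASTA file."""
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith(">"))

# e.g. count_fasta_headers("sr_k49_a3.contigs.nooverlap.fa")
```

The log above reports 16708266 contigs loaded, so a count in the tens of millions on your own run would be consistent with the header-count explanation.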