Closed YiweiNiu closed 5 years ago
Hi,
bad_alloc usually means that the system ran out of memory. How much RAM does your machine have? It seems that you have raw reads at 100x coverage. You might try to downsample them to, say, 40x (taking the longest ones), which should reduce the memory requirements. You should also be able to rerun the repeat resolution step with the entire set of reads afterwards.
Note that the Canu reads have 30x coverage, which is why less memory was required. If you are running with error-corrected reads (which is also an option), make sure you are using the 'pacbio-corr' option, not 'pacbio-raw'.
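The downsampling suggested above (keep the longest reads until a target coverage is reached) could be sketched roughly as follows. This is only an illustration and not part of Flye: the function names are made up, the input is assumed to be an uncompressed FASTA file, and the genome size is passed as a plain number of bases. It makes two passes over the file so that only read lengths, not sequences, are held in memory.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an uncompressed FASTA file."""
    header = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def length_cutoff(lengths: Iterable[int], genome_size: int,
                  target_coverage: float) -> int:
    """Return the shortest read length that still has to be kept so that the
    longest reads add up to genome_size * target_coverage bases.
    Returns 0 (keep everything) if the data never reaches that total."""
    budget = genome_size * target_coverage
    total = 0
    for length in sorted(lengths, reverse=True):
        total += length
        if total >= budget:
            return length
    return 0


def downsample_longest(in_path: str, out_path: str, genome_size: int,
                       target_coverage: float) -> int:
    """Write every read at least as long as the cutoff; return how many were kept."""
    # First pass: lengths only, to find the cutoff.
    cutoff = length_cutoff(
        (len(seq) for _, seq in read_fasta(in_path)),
        genome_size,
        target_coverage,
    )
    # Second pass: stream the reads and keep the long ones.
    kept = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            if len(seq) >= cutoff:
                out.write(f">{header}\n{seq}\n")
                kept += 1
    return kept
```

For example, `downsample_longest("reads.fasta", "reads.40x.fasta", 2_000_000_000, 40)` would aim for roughly 40x of a 2G genome (the file names here are placeholders). Reads that tie with the cutoff length are all kept, so the result can end up slightly above the target.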
Thank you for your reply! The compute node I submitted the job to has 2T RAM. I don't know if that's enough.
I'll try to downsample the raw data. A basic question: apart from the memory required, what's the difference between using all the raw data (say 100x) and using downsampled data (say the longest 50x)?
Those 100x reads might carry extra connectivity information (you could resolve more repeats, for example). But some studies (the Canu paper, for example) suggest that you generally don't need more than 40x, although this of course also depends on genome complexity, ploidy, etc. Extra coverage also helps to get a good final consensus.
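To check what coverage a read set actually gives, the arithmetic is simply total read bases divided by genome size. A minimal sketch, with hypothetical helper names and assuming genome sizes written with a k/m/g suffix such as the `130m` used elsewhere in this thread:

```python
from typing import Iterable


def parse_genome_size(text: str) -> int:
    """Convert a size such as '130m' or '2g' into a number of bases."""
    multipliers = {"k": 10**3, "m": 10**6, "g": 10**9}
    text = text.strip().lower()
    if text and text[-1] in multipliers:
        return int(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


def estimated_coverage(read_lengths: Iterable[int], genome_size: int) -> float:
    """Coverage = total number of read bases / genome size."""
    return sum(read_lengths) / genome_size
```

With the read lengths collected from a FASTA file, `estimated_coverage(lengths, parse_genome_size("130m"))` gives a rough figure to compare against the 40x rule of thumb.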
I see. Thank you!
Hi, I hit the same "ERROR: Caught unhandled exception: std::bad_alloc" issue when I ran 30x and 90x ONT raw reads on our cluster with 2 nodes and 300G RAM. After I subsampled the reads via Canu correction, Flye worked for both datasets. But why didn't the 30x ONT raw reads work?
Here is the error message:
~/software/Flye/bin/flye -t 36 --nano-raw $fasta --genome-size 130m --out-dir
[2018-09-21 17:02:25] INFO: Running Flye 2.3.5-release
[2018-09-21 17:02:25] INFO: Assembling reads
[2018-09-21 17:02:25] INFO: Running with k-mer size: 17
[2018-09-21 17:02:25] INFO: Reading sequences
[2018-09-21 17:07:05] INFO: Reads N50/90: 21104 / 5385
[2018-09-21 17:07:05] INFO: Selected minimum overlap 5000
[2018-09-21 17:07:05] INFO: Expected read coverage: 26
[2018-09-21 17:07:05] INFO: Generating solid k-mer index
[2018-09-21 17:10:26] INFO: Counting kmers (1/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2018-09-21 17:11:11] INFO: Counting kmers (2/2):
0% 10% 20% 30% 40% 50% 60% 70% [2018-09-21 17:18:04] ERROR: Caught unhandled exception: std::bad_alloc
[2018-09-21 17:18:04] ERROR: flye-assemble(_Z16exceptionHandlerv+0xd0) [0x43f530]
[2018-09-21 17:18:04] ERROR: /lib64/libstdc++.so.6(+0x5e926) [0x7fc45deec926]
[2018-09-21 17:18:04] ERROR: /lib64/libstdc++.so.6(+0x5e953) [0x7fc45deec953]
[2018-09-21 17:18:04] ERROR: /lib64/libstdc++.so.6(+0xb5275) [0x7fc45df43275]
[2018-09-21 17:18:04] ERROR: /lib64/libpthread.so.0(+0x7df5) [0x7fc45d761df5]
[2018-09-21 17:18:04] ERROR: /lib64/libc.so.6(clone+0x6d) [0x7fc45d48f1ad]
[2018-09-21 17:18:05] ERROR: Command '['flye-assemble', '-l', '/home/panpan/assembly_flye/col.NF.flye2/flye.log', '-t', '36', '/home/panpan/raw_porechoped_data/porechop_col_fastq/fasta/before_cont/porechop_col.NF.reads.fasta', '/home/panpan/assembly_flye/col.NF.flye2/0-assembly/draft_assembly.fasta', '130000000', '/home/panpan/software/Flye/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1
Thank you,
panpan
Hi,
That looks strange: for a genome of ~130m and 30x coverage, it should not use more than 50G.
Does the node that you are using to run Flye have 300G RAM (or are you referring to the total memory of all nodes)? Could you send me the flye.log file?
It would also be helpful if you could watch the memory consumption to make sure that it indeed ran out of memory. You can either manually watch top/htop, or use this script - https://github.com/jhclark/memusg.
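If neither option is handy, a rough alternative is to launch the command from a small Python wrapper and read the peak resident memory of the child processes afterwards. This is only a sketch under a few assumptions: it uses the standard `resource` module (Unix only), the function name is made up, and `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS.

```python
import resource
import subprocess
import sys
from typing import List, Tuple


def run_and_report_peak_memory(command: List[str]) -> Tuple[int, int]:
    """Run a command to completion and return (exit_code, peak_rss).

    peak_rss is ru_maxrss for the waited-for child processes:
    kilobytes on Linux, bytes on macOS.
    """
    completed = subprocess.run(command)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Usage: python peak_mem.py <command> [args...]
    code, peak = run_and_report_peak_memory(sys.argv[1:])
    print(f"exit code: {code}, peak RSS (ru_maxrss units): {peak}",
          file=sys.stderr)
    sys.exit(code)
```

Saved as, say, `peak_mem.py` (a placeholder name), it would be run by prefixing the usual command line. If the reported peak is far below the RAM of the node, the bad_alloc is probably not a plain out-of-memory condition.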
Hi, I got this error message when using version 2.3.2 and version 2.3.3.
The genome is about 2G, and default parameters were used.
version 2.3.2
version 2.3.3
BTW, I also ran Flye 2.3.3 on the corrected reads of Canu, and it ran successfully. Here are the logs, in case they are useful. Do you know what could have caused it? Thanks in advance!
Best, Yiwei Niu