Open wangzhennan14 opened 6 years ago
Could you check if you have enough memory?
Hello,
Thank you for the great tool. I am also getting a seg fault.
[M::main] ===> Step 1: reading read mappings <===
/opt/gridengine/default/spool/compute-9-3/job_scripts/1347083: line 19: 163580 Segmentation fault (core dumped) ../miniasm/miniasm -f /scratch/genomics02/frandsenp/concat_reads/all_reads.fasta.gz reads.paf.gz > reads.gfa
At first I thought the cause was insufficient memory, but after I allocated 250 GB it still segfaulted while using about 130 GB (the resulting core dump is about 129 GB). I am now running it with 500 GB of RAM, just in case, but if you have any insight on what I might adjust in the meantime, I am very open to it.
Thank you,
Paul
I'm also seeing a segmentation fault in Step 3. It's using 250 GB of RAM, and the machine has 2.5 TB of RAM, so memory usage should be okay. This assembly succeeded with miniasm -c2, but failed with miniasm -c3. I'll try it once more.
❯❯❯ miniasm -c3 -f Q903_11.fq.gz Q903_11.minimap2.paf.gz >Q903_11.minimap2.c3.miniasm.gfa
[M::main] ===> Step 1: reading read mappings <===
[M::ma_hit_read::33696.481*0.98] read 6043539395 hits; stored 8042486383 hits and 4275859 sequences (53639474019 bp)
[M::main] ===> Step 2: 1-pass (crude) read selection <===
[M::ma_hit_sub::51174.764*0.99] 3888964 query sequences remain after sub
[M::ma_hit_cut::59938.663*0.99] 6138438897 hits remain after cut
[M::ma_hit_flt::60340.037*0.99] 2636615608 hits remain after filtering; crude coverage after filtering: 445.30
[M::main] ===> Step 3: 2-pass (fine) read selection <===
[M::ma_hit_sub::60693.790*0.99] 3681650 query sequences remain after sub
[M::ma_hit_cut::60946.122*0.99] 2314586291 hits remain after cut
While building Q903_11.minimap2.c3.miniasm.gfa: Error 139 executing command time -v -o Q903_11.minimap2.c3.miniasm.gfa.time miniasm -c3 -f Q903_11.fq.gz Q903_11.minimap2.paf.gz >Q903_11.minimap2.c3.miniasm.gfa
Deleting Q903_11.minimap2.c3.miniasm.gfa
Command exited with non-zero status 2
11527.35user 49308.91system 17:29:40elapsed 96%CPU (0avgtext+0avgdata 251763444maxresident)k
253003824inputs+88outputs (0major+1440637832minor)pagefaults 0swaps
Command terminated by signal 11
User time (seconds): 11524.88
System time (seconds): 49308.86
Percent of CPU this job got: 96%
Elapsed (wall clock) time (h:mm:ss or m:ss): 17:29:37
Maximum resident set size (kbytes): 251763444
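For reference, the figures in the log above can be decoded with a small sketch. It assumes the common shell convention that an exit status above 128 means 128 plus a signal number, and that GNU time reports the maximum resident set size in KiB. The helper names are made up for illustration and are not part of miniasm.

```python
import signal

def signal_from_exit_status(status):
    """Return the signal number encoded in a shell exit status, or None."""
    # Assumption: the shell reports death-by-signal as 128 + signal number.
    return status - 128 if status > 128 else None

def kib_to_gib(kib):
    """Convert a KiB figure (as printed by GNU time -v) to GiB."""
    return kib / 1024**2

sig = signal_from_exit_status(139)
print(sig, signal.Signals(sig).name)       # signal number and its name
print(round(kib_to_gib(251763444), 1))     # peak resident memory in GiB
```

Under these assumptions, "Error 139" and "terminated by signal 11" describe the same event (a SIGSEGV), and the peak resident set size is about 240 GiB, far below the 2.5 TB available on the machine.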
Do you have the log file for -c2? BTW, using -Rc2 usually uses less memory at the cost of performance. Sometimes -Rc2 may give better assembly than -c3.
I ran miniasm -c3 a second time, and I saw the same log and segmentation fault, so it seems repeatable.
Here's the log for miniasm -c2:
❯❯❯ miniasm -c2 -f Q903_11.fq.gz Q903_11.minimap2.paf.gz >Q903_11.minimap2.c2.miniasm.gfa
[M::main] ===> Step 1: reading read mappings <===
[M::ma_hit_read::24348.043*0.99] read 6043539395 hits; stored 8042486383 hits and 4275859 sequences (53639474019 bp)
[M::main] ===> Step 2: 1-pass (crude) read selection <===
[M::ma_hit_sub::41732.276*0.99] 4042833 query sequences remain after sub
[M::ma_hit_cut::44159.173*0.99] 6725880000 hits remain after cut
[M::ma_hit_flt::45356.645*0.99] 2252966414 hits remain after filtering; crude coverage after filtering: 357.92
[M::main] ===> Step 3: 2-pass (fine) read selection <===
[M::ma_hit_sub::45694.031*0.99] 3926323 query sequences remain after sub
[M::ma_hit_cut::46010.082*0.99] 1976194002 hits remain after cut
[M::ma_hit_contained::47574.346*0.99] 679121 sequences and 30233329 hits remain after containment removal
[M::main] ===> Step 4: graph cleaning <===
[M::ma_sg_gen] read 14964141 arcs
[M::main] ===> Step 4.1: transitive reduction <===
[M::asg_arc_del_trans] transitively reduced 2690476 arcs
[M::asg_arc_del_multi] removed 39354 multi-arcs
[M::asg_arc_del_asymm] removed 208039 asymmetric arcs
[M::main] ===> Step 4.2: initial tip cutting and bubble popping <===
[M::asg_cut_tip] cut 342548 tips
[M::asg_pop_bubble] popped 702 bubbles and trimmed 410 tips
[M::main] ===> Step 4.3: cutting short overlaps (3 rounds in total) <===
[M::asg_arc_del_multi] removed 0 multi-arcs
[M::asg_arc_del_asymm] removed 751999 asymmetric arcs
[M::asg_arc_del_short] removed 1752535 short overlaps
[M::asg_cut_tip] cut 61021 tips
[M::asg_pop_bubble] popped 701 bubbles and trimmed 326 tips
[M::asg_arc_del_multi] removed 0 multi-arcs
[M::asg_arc_del_asymm] removed 139551 asymmetric arcs
[M::asg_arc_del_short] removed 179583 short overlaps
[M::asg_cut_tip] cut 15799 tips
[M::asg_pop_bubble] popped 338 bubbles and trimmed 169 tips
[M::asg_arc_del_multi] removed 0 multi-arcs
[M::asg_arc_del_asymm] removed 83251 asymmetric arcs
[M::asg_arc_del_short] removed 105729 short overlaps
[M::asg_cut_tip] cut 9249 tips
[M::asg_pop_bubble] popped 280 bubbles and trimmed 182 tips
[M::main] ===> Step 4.4: removing short internal sequences and bi-loops <===
[M::asg_cut_internal] cut 2743 internal sequences
[M::asg_cut_biloop] cut 15947 small bi-loops
[M::asg_cut_tip] cut 1398 tips
[M::asg_pop_bubble] popped 31 bubbles and trimmed 20 tips
[M::main] ===> Step 4.5: aggressively cutting short overlaps <===
[M::asg_arc_del_multi] removed 0 multi-arcs
[M::asg_arc_del_asymm] removed 45288 asymmetric arcs
[M::asg_arc_del_short] removed 57436 short overlaps
[M::asg_cut_tip] cut 5713 tips
[M::asg_pop_bubble] popped 195 bubbles and trimmed 161 tips
[M::main] ===> Step 5: generating unitigs <===
[M::main] Version: 0.2-r128
[M::main] CMD: miniasm -c2 -f Q903_11.fq.gz Q903_11.minimap2.paf.gz
[M::main] Real time: 48352.940 sec; CPU: 48032.413 sec
Command being timed: "miniasm -c2 -f Q903_11.fq.gz Q903_11.minimap2.paf.gz"
User time (seconds): 12570.57
System time (seconds): 35462.22
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 13:25:55
Maximum resident set size (kbytes): 251763388
Thanks for the tip about miniasm -Rc2. I'll try it out.
With -c2:
[M::main] ===> Step 3: 2-pass (fine) read selection <===
[M::ma_hit_sub::45694.031*0.99] 3926323 query sequences remain after sub
[M::ma_hit_cut::46010.082*0.99] 1976194002 hits remain after cut
With -c3:
[M::main] ===> Step 3: 2-pass (fine) read selection <===
[M::ma_hit_sub::60693.790*0.99] 3681650 query sequences remain after sub
[M::ma_hit_cut::60946.122*0.99] 2314586291 hits remain after cut
Note the difference between 1976194002 and 2314586291 (the latter exceeds 2^31 - 1 = 2147483647). I haven't checked the source code, but I guess -c3 failed because the containment removal part might be using 31-bit integers somewhere.
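The guess above can be checked arithmetically. The following is only a sketch of the suspected overflow, not a confirmation of where (or whether) miniasm stores these counts in a 32-bit signed integer. The two hit counts are copied from the logs, and `ctypes.c_int32` stands in for a C `int32_t`.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold

# "hits remain after cut" in Step 3, taken from the two logs in this thread.
hits_after_cut = {"-c2": 1976194002, "-c3": 2314586291}

def fits_in_int32(n):
    """True if n is representable as a signed 32-bit integer."""
    return -2**31 <= n <= INT32_MAX

for flag, hits in hits_after_cut.items():
    # ctypes does no overflow checking, so this mimics C-style wraparound.
    wrapped = ctypes.c_int32(hits).value
    print(flag, hits, "fits" if fits_in_int32(hits) else "overflows", wrapped)
```

The -c2 count fits and is unchanged, while the -c3 count wraps to a negative number. If such a value were used as an array index or allocation size, a segmentation fault would be a plausible outcome, which is consistent with the hypothesis but does not prove it.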
Those pesky signed 31-bit integers. =) I tend to use size_t for unsigned counters, or ssize_t if you want it to be signed for some reason. Gives you 64-bit counters without having to resort to uint64_t.
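One caveat to the size_t suggestion: its width depends on the platform, so it is 64-bit only on 64-bit targets. A quick sketch to inspect the widths on the machine at hand, using Python's ctypes as a stand-in for the C compiler's view of these types:

```python
import ctypes

# Width in bits of each C type as seen by this Python build's platform ABI.
bits = {
    "int": 8 * ctypes.sizeof(ctypes.c_int),
    "size_t": 8 * ctypes.sizeof(ctypes.c_size_t),
    "ssize_t": 8 * ctypes.sizeof(ctypes.c_ssize_t),
    "uint64_t": 8 * ctypes.sizeof(ctypes.c_uint64),
}

for name, width in bits.items():
    print(f"{name}: {width} bits")
```

On typical 64-bit Linux systems int is 32 bits while size_t and ssize_t are 64 bits. Only uint64_t is guaranteed to be 64 bits everywhere.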
Hi Liheng, when I used miniasm to assemble a 100x PacBio genome, I got the following error: Segmentation fault. What is the matter? I followed the manual, and the logs were:
Can you give me some advice on how to solve this problem? Thank you very much!