voutcn / megahit

Ultra-fast and memory-efficient (meta-)genome assembler
http://www.ncbi.nlm.nih.gov/pubmed/25609793
GNU General Public License v3.0

[1.2.8] Error during assembly phase #234

Status: Closed (sjaenick closed 5 years ago)

sjaenick commented 5 years ago

A medium-sized assembly aborted twice during the assembly phase with megahit 1.2.8. The last entries from the log file:

2019-08-28 20:03:55 - INFO  sorting/seq_to_sdbg.cpp       :  793 - Number of $ A C G T A- C- G- T-:
2019-08-28 20:03:55 - INFO  sorting/seq_to_sdbg.cpp       :  794 - 529869854 6505201606 10955771898 10922131276 6553823060 323896398 867983064 888610938 300617847
2019-08-28 20:03:55 - INFO  sorting/seq_to_sdbg.cpp       :  800 - Total number of edges: 37847905941
2019-08-28 20:03:55 - INFO  sorting/seq_to_sdbg.cpp       :  801 - Total number of ONEs: 34936927840
2019-08-28 20:03:55 - INFO  sorting/seq_to_sdbg.cpp       :  803 - Total number of $v edges: 529869854
2019-08-28 20:03:55 - INFO  sorting/base_engine.cpp       :  210 - Postprocess done. Time elapsed: 0.0468
2019-08-28 20:04:12 - INFO  utils/utils.h                 :  152 - Real: 12523.2730     user: 295278.4131       sys: 1028.5337  maxrss: 193903512
2019-08-28 20:04:25 - Assemble contigs from SdBG for k = 21
2019-08-28 20:04:25 - command /ceph/mgx-sw/bin/megahit_core assemble -s /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/tmp/k21/21 -o /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/intermediate_contigs/k21 -t 39 --min_standalone 1000 --prune_level 2 --merge_len 20 --merge_similar 0.95 --cleaning_rounds 5 --disconnect_ratio 0.1 --low_local_ratio 0.2 --cleaning_rounds 5 --min_depth 2 --bubble_level 2 --max_tip_len -1 --careful_bubble
2019-08-28 20:13:29 - INFO  main_assemble.cpp             :  129 - Loading succinct de Bruijn graph: /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/tmp/k21/21Done. Time elapsed: 544.494202
2019-08-28 20:13:29 - INFO  main_assemble.cpp             :  133 - Number of Edges: 37847905941; K value: 21
2019-08-28 20:13:29 - INFO  main_assemble.cpp             :  140 - Number of CPU threads: 39
2019-08-28 20:16:41 - INFO  assembly/sdbg_pruning.cpp     :  160 - Removing tips with length less than 2; Accumulated tips removed: 159584137; time elapsed: 32.8823
2019-08-28 20:17:44 - INFO  assembly/sdbg_pruning.cpp     :  160 - Removing tips with length less than 4; Accumulated tips removed: 270250007; time elapsed: 62.9552
2019-08-28 20:20:03 - INFO  assembly/sdbg_pruning.cpp     :  160 - Removing tips with length less than 8; Accumulated tips removed: 353051795; time elapsed: 138.7500
2019-08-28 20:25:40 - INFO  assembly/sdbg_pruning.cpp     :  160 - Removing tips with length less than 16; Accumulated tips removed: 427543988; time elapsed: 337.2595
2019-08-28 20:34:14 - INFO  assembly/sdbg_pruning.cpp     :  160 - Removing tips with length less than 32; Accumulated tips removed: 486940419; time elapsed: 514.1360
2019-08-28 20:42:06 - INFO  assembly/sdbg_pruning.cpp     :  169 - Removing tips with length less than 42; Accumulated tips removed: 505295443; time elapsed: 472.3857
2019-08-28 20:42:07 - INFO  main_assemble.cpp             :  158 - Tips removal done! Time elapsed(sec): 1717.534
2019-08-29 06:34:17 - INFO  assembly/unitig_graph.cpp     :   84 - Graph size without loops: 2485161669, palindrome: 214170
2019-08-29 07:09:57 - INFO  main_assemble.cpp             :  167 - unitig graph size: 2485238965, time for building: 37670.450
2019-08-29 11:52:09 - INFO  assembly/contig_stat.h        :   40 - Max: 3092, Min: 22, N50: 24, number contigs: 2485238965, number isolated: 4734484, number looped: 77296, total size: 68834473749,
2019-08-29 11:52:09 - [ERROR] Cannot open /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/intermediate_contigs/k21.bubble_seq.fa. Now exit to system...
2019-08-29 11:52:52 - Error occurs, please refer to /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/log for detail
2019-08-29 11:52:52 - Command: /ceph/mgx-sw/bin/megahit_core assemble -s /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/tmp/k21/21 -o /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/intermediate_contigs/k21 -t 39 --min_standalone 1000 --prune_level 2 --merge_len 20 --merge_similar 0.95 --cleaning_rounds 5 --disconnect_ratio 0.1 --low_local_ratio 0.2 --cleaning_rounds 5 --min_depth 2 --bubble_level 2 --max_tip_len -1 --careful_bubble; Exit code 255

There's more than enough system memory and disk space available. Any ideas?

voutcn commented 5 years ago

The program failed to open the file /ceph/sge-tmp/MGX2_devel/10/a20ee79fd94912d2a52bbf3a42359c45/megahit_out/intermediate_contigs/k21.bubble_seq.fa for writing.

Did the first failure happen when trying to open this file?

BTW, according to the log there are about 37.8 billion k-mers (edges) and about 2.5 billion unitigs, which means that this metagenome is either very complex or contains many erroneous k-mers. I suggest using a larger k_min (the --k-min option) if you want it to finish faster.
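As a rough illustration of how fragmented the k=21 graph is, the mean unitig length can be computed from the contig_stat.h line in the log (total size divided by number of contigs). This is only a back-of-the-envelope sketch; the constant and function names are made up for this example and are not part of MEGAHIT.

```cpp
#include <cstdint>

// Values copied from the contig_stat.h line of the log above.
// The names are hypothetical and exist only for this illustration.
constexpr std::uint64_t kTotalSize = 68834473749ULL;   // "total size"
constexpr std::uint64_t kNumContigs = 2485238965ULL;   // "number contigs"

// Mean unitig length in bases: total assembled bases / number of unitigs.
double mean_unitig_length(std::uint64_t total_size, std::uint64_t num_contigs) {
    return static_cast<double>(total_size) / static_cast<double>(num_contigs);
}
```

Calling mean_unitig_length(kTotalSize, kNumContigs) gives a little under 28 bases, close to the reported minimum of 22 and N50 of 24, so most unitigs at k=21 are only a few bases longer than a single k-mer.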

sjaenick commented 5 years ago

Yes, the error occurred for the very same file both times. I added some debugging code (mostly strerror(errno)) to the xfopen() function, but this time it didn't fail for k=21. The assembly is still running; I'll report back later.
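For anyone hitting the same error, here is a minimal sketch of that kind of debugging aid. It is not MEGAHIT's actual xfopen(); the function name open_with_reason and its signature are assumptions made for this example. It only shows the general idea of capturing errno right after a failed fopen() and turning it into a readable message with strerror().

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper (not MEGAHIT's xfopen): tries to open `path` and,
// on failure, fills `reason` with the OS-level error description.
std::FILE* open_with_reason(const char* path, const char* mode, std::string* reason) {
    errno = 0;
    std::FILE* fp = std::fopen(path, mode);
    if (fp == nullptr) {
        // Save errno immediately; later library calls may overwrite it.
        const int saved = errno;
        *reason = std::string("Cannot open ") + path + ": " +
                  std::strerror(saved) + " (errno " + std::to_string(saved) + ")";
    }
    return fp;
}
```

With a message like this, the log would show whether the failure was, for example, a permission problem, a missing directory, or an I/O error reported by the filesystem, instead of only "Cannot open".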

sjaenick commented 5 years ago

Assembly completed on the third attempt. Closing; I'll reopen with more information if it occurs again.