voutcn / megahit

Ultra-fast and memory-efficient (meta-)genome assembler
http://www.ncbi.nlm.nih.gov/pubmed/25609793
GNU General Public License v3.0

Error in assembly. Error occurs after *Assemble contigs from SdBG for k = 21* #315

Open hannah-doris opened 2 years ago

hannah-doris commented 2 years ago

I am currently trying to assemble contigs from a metagenome that contains one eukaryotic organism and many prokaryotic organisms. I am running the assembly with MEGAHIT v1.2.8. The command that I am using is:

megahit -1 forward.fastq -2 reverse.fastq --min-contig-len 1000 -m 0.85 -o 01assembly

The Slurm output file looks like this:

Starting job 3899359 on c3-30 at Tue Oct 12 16:08:04 CEST 2021

2021-10-12 16:08:08 - MEGAHIT v1.2.8
2021-10-12 16:08:08 - Using megahit_core with POPCNT and BMI2 support
2021-10-12 16:08:08 - Convert reads to binary library
2021-10-12 16:12:10 - b'INFO sequence/io/sequence_lib.cpp : 77 - Lib 0 (/cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/D1060_Bac_FDSW210221551-1r_H73WKDSX2_L3_1.fq.gz,/cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/D1060_Bac_FDSW210221551-1r_H73WKDSX2_L3_2.fq.gz): pe, 157668164 reads, 150 max length'
2021-10-12 16:12:10 - b'INFO utils/utils.h : 152 - Real: 241.8901\tuser: 155.9623\tsys: 15.8497\tmaxrss: 258020'
2021-10-12 16:12:10 - k-max reset to: 141
2021-10-12 16:12:10 - Start assembly. Number of CPU threads 80
2021-10-12 16:12:10 - k list: 21,29,39,59,79,99,119,141
2021-10-12 16:12:10 - Memory used: 344435638476
2021-10-12 16:12:10 - Extract solid (k+1)-mers for k = 21
2021-10-12 16:29:07 - Build graph for k = 21
2021-10-12 16:35:03 - Assemble contigs from SdBG for k = 21
2021-10-12 16:56:51 - Error occurs, please refer to /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/log for detail
2021-10-12 16:56:51 - Command: /cluster/software/MEGAHIT/1.2.8-GCCcore-8.2.0/bin/megahit_core assemble -s /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/tmp/k21/21 -o /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/intermediate_contigs/k21 -t 80 --min_standalone 1000 --prune_level 2 --merge_len 20 --merge_similar 0.95 --cleaning_rounds 5 --disconnect_ratio 0.1 --low_local_ratio 0.2 --cleaning_rounds 5 --min_depth 2 --bubble_level 2 --max_tip_len -1 --careful_bubble; Exit code -6

Task and CPU usage stats:
JobID         JobName     AllocCPUS  NTasks  MinCPU    MinCPUTask  AveCPU    Elapsed   ExitCode
3899359       Megahit_A+  8                                                  00:48:49  0:0
3899359.bat+  batch       8          1       10:15:04  0           10:15:04  00:48:49  0:0
3899359.ext+  extern      8          1       00:00:00  0           00:00:00  00:48:49  0:0

Memory usage stats:
JobID         MaxRSS     MaxRSSTask  AveRSS     MaxPages  MaxPagesTask  AvePages
3899359
3899359.bat+  20386760K  0           20386760K  0         0             0
3899359.ext+  0          0           0          0         0             0

Disk usage stats:
JobID         MaxDiskRead  MaxDiskReadTask  AveDiskRead  MaxDiskWrite  MaxDiskWriteTask  AveDiskWrite
3899359
3899359.bat+  5.59M        0                5.59M        0.02M         0                 0.02M
3899359.ext+  0.00M        0                0.00M        0             0                 0

Job 3899359 completed at Tue Oct 12 16:56:52 CEST 2021

To see where the error occurred, I looked in the log file for anything logged after 2021-10-12 16:35:03, since that is the last step in the Slurm output that completed without an error.

2021-10-12 16:35:03 - Assemble contigs from SdBG for k = 21
2021-10-12 16:35:03 - command /cluster/software/MEGAHIT/1.2.8-GCCcore-8.2.0/bin/megahit_core assemble -s /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/tmp/k21/21 -o /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/intermediate_contigs/k21 -t 80 --min_standalone 1000 --prune_level 2 --merge_len 20 --merge_similar 0.95 --cleaning_rounds 5 --disconnect_ratio 0.1 --low_local_ratio 0.2 --cleaning_rounds 5 --min_depth 2 --bubble_level 2 --max_tip_len -1 --careful_bubble
2021-10-12 16:35:50 - b'INFO main_assemble.cpp : 129 - Loading succinct de Bruijn graph: /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/tmp/k21/21Done. Time elapsed: 47.636295'
2021-10-12 16:35:50 - b'INFO main_assemble.cpp : 133 - Number of Edges: 2082005199; K value: 21'
2021-10-12 16:35:50 - b'INFO main_assemble.cpp : 140 - Number of CPU threads: 80'
2021-10-12 16:36:20 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 2; Accumulated tips removed: 1545194; time elapsed: 2.0835'
2021-10-12 16:36:23 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 4; Accumulated tips removed: 3381612; time elapsed: 2.9032'
2021-10-12 16:36:28 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 8; Accumulated tips removed: 5415651; time elapsed: 4.5718'
2021-10-12 16:36:34 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 16; Accumulated tips removed: 7566431; time elapsed: 6.6341'
2021-10-12 16:36:42 - b'INFO assembly/sdbg_pruning.cpp : 160 - Removing tips with length less than 32; Accumulated tips removed: 9033760; time elapsed: 7.7706'
2021-10-12 16:36:49 - b'INFO assembly/sdbg_pruning.cpp : 169 - Removing tips with length less than 42; Accumulated tips removed: 9315469; time elapsed: 6.8051'
2021-10-12 16:36:49 - b'INFO main_assemble.cpp : 158 - Tips removal done! Time elapsed(sec): 58.471'
2021-10-12 16:47:54 - b'INFO assembly/unitig_graph.cpp : 84 - Graph size without loops: 94538080, palindrome: 4168'
2021-10-12 16:48:12 - b'INFO main_assemble.cpp : 167 - unitig graph size: 94539434, time for building: 682.811'
2021-10-12 16:54:20 - b'INFO assembly/contig_stat.h : 40 - Max: 5135, Min: 22, N50: 29, number contigs: 94539434, number isolated: 166596, number looped: 1354, total size: 2977830123,'
2021-10-12 16:54:20 - b'INFO main_assemble.cpp : 184 - Graph cleaning round 1'
2021-10-12 16:54:50 - b'INFO main_assemble.cpp : 201 - Number of bubbles removed: 492766, Time elapsed(sec): 29.689'
2021-10-12 16:55:18 - b'INFO main_assemble.cpp : 211 - Number of complex bubbles removed: 91460, Time elapsed(sec): 27.939709'
2021-10-12 16:56:39 - b"megahit_core: /cluster/work/users/vegarde/build/MEGAHIT/1.2.8/GCCcore-8.2.0/megahit-1.2.8/src/assembly/unitig_graph.cpp:315: void UnitigGraph::Refresh(bool): Assertion `!(next_adapter.GetFlag() & kDeleted)' failed."
2021-10-12 16:56:51 - Error occurs, please refer to /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/log for detail
2021-10-12 16:56:51 - Command: /cluster/software/MEGAHIT/1.2.8-GCCcore-8.2.0/bin/megahit_core assemble -s /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/tmp/k21/21 -o /cluster/projects/nn9800k/Hannah_Doris/MetaG_MayJune2021/X204SC21062815-Z01-F001/raw_data/D1060_Bac/01_Ass_Megahit_D1060_Bac/intermediate_contigs/k21 -t 80 --min_standalone 1000 --prune_level 2 --merge_len 20 --merge_similar 0.95 --cleaning_rounds 5 --disconnect_ratio 0.1 --low_local_ratio 0.2 --cleaning_rounds 5 --min_depth 2 --bubble_level 2 --max_tip_len -1 --careful_bubble; Exit code -6

It seems that the actual failure occurred at 16:56:39 (the "Assertion ... failed" entry above), although I am not completely sure what the error means. Any information would be greatly appreciated. Thanks!

Hannah
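For anyone reading a similarly flattened log, here is a minimal Python sketch of the search described above: it splits the text into timestamped entries, picks out the ones that mention an assertion or an exit code, and translates the exit code into a signal name. It assumes the MEGAHIT wrapper reports return codes the way Python's `subprocess` module does, where a negative value -N means the child was killed by signal N. The `SAMPLE` string is an abbreviated stand-in for the real log, and all function names are made up for this example.

```python
import re
import signal

# Every entry in the log starts with "YYYY-MM-DD HH:MM:SS - ".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} - ")
EXIT_CODE = re.compile(r"Exit code (-?\d+)")

# Abbreviated stand-in for the flattened log (paths and details shortened).
SAMPLE = (
    "2021-10-12 16:35:03 - Assemble contigs from SdBG for k = 21 "
    "2021-10-12 16:54:20 - b'INFO main_assemble.cpp : 184 - Graph cleaning round 1' "
    "2021-10-12 16:56:39 - b\"megahit_core: unitig_graph.cpp:315: Assertion failed.\" "
    "2021-10-12 16:56:51 - Command: megahit_core assemble; Exit code -6"
)


def split_entries(log_text):
    """Split a flattened log into one string per timestamped entry."""
    starts = [m.start() for m in TIMESTAMP.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    return [log_text[a:b].strip() for a, b in zip(starts, ends)]


def find_failures(entries):
    """Keep only entries that mention an assertion or an exit code."""
    return [e for e in entries if "Assertion" in e or "Exit code" in e]


def describe_exit_code(code):
    """Name the signal behind a negative, subprocess-style return code."""
    if code < 0:
        return signal.Signals(-code).name
    return f"exit status {code}"


if __name__ == "__main__":
    for entry in find_failures(split_entries(SAMPLE)):
        print(entry)
        match = EXIT_CODE.search(entry)
        if match:
            print("->", describe_exit_code(int(match.group(1))))
```

Under that assumption, "Exit code -6" corresponds to signal 6, which is SIGABRT on Linux. That matches the log: a failed C `assert` calls `abort()`, which raises SIGABRT, so the exit code and the assertion at 16:56:39 are most likely the same event rather than two separate problems.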

KJ-Ma commented 1 year ago

I ran into the same error when using MEGAHIT. I think it was caused by "-t 80", which may result in a vast number of temp files. I switched to "-t 8" and my jobs ran successfully.
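One detail from the original post may be related: the Slurm stats list AllocCPUS as 8, while MEGAHIT reports 80 CPU threads. Whether that mismatch causes the assertion failure is not confirmed in this thread. As a hedged sketch, the thread count could be taken from the Slurm allocation instead of being left to the default. `SLURM_CPUS_PER_TASK` is a standard Slurm environment variable (set when `--cpus-per-task` is requested), `-t` is MEGAHIT's thread option, and the helper names and the fallback of 8 are assumptions for this example.

```python
import os
import shlex


def pick_threads(env=None, fallback=8):
    """Use the Slurm CPU allocation if present, otherwise a small fallback."""
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_PER_TASK", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return fallback


def megahit_command(threads):
    """Build the command from the original post with an explicit -t value."""
    return [
        "megahit",
        "-1", "forward.fastq",
        "-2", "reverse.fastq",
        "--min-contig-len", "1000",
        "-m", "0.85",
        "-t", str(threads),
        "-o", "01assembly",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(shlex.join(megahit_command(pick_threads())))
```

The printed command can be pasted into the batch script as-is. With no Slurm variable set, it falls back to the "-t 8" value that worked above.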