ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

Genomesize #76

trichoptera closed this issue 5 years ago

trichoptera commented 5 years ago

Hi,

I have a question regarding the -g option (estimated genome size) in wtdbg2. How important is it that -g is set correctly, and how much does this option matter at all? I ran wtdbg2 on nanopore data with and without -g 600m, and the assembly statistics differed [for example, total length of assembly (>= 0 bp): 385942422 (without -g) vs. 391227614 (with -g 600m); # of contigs: 1702 vs. 1680; N50: 438440 vs. 821380; L50: 263 vs. 153].

Thanks

ruanjue commented 5 years ago

Genome size is used to select about 50X of reads (-X 50), or to change -e 3 to -e 2 when coverage is too low. If you know wtdbg2's parameters well, you can ignore -g and -x, which are designed for beginners.
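
To make the arithmetic behind that answer concrete, here is a rough Python sketch of how a genome size estimate could turn into a read budget and an edge-depth setting. It is not wtdbg2's code. The function names, the longest-reads-first ordering, and the `low_cov_threshold` value of 20 are all assumptions for illustration; only the 50X depth and the -e 3 / -e 2 values come from the comment above.

```python
def parse_size(text):
    """Parse a size string such as '600m' into a number of bases."""
    units = {"k": 10**3, "m": 10**6, "g": 10**9}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(text)


def select_reads(read_lengths, genome_size, max_depth=50.0):
    """Keep reads until roughly max_depth * genome_size bases are collected.

    Assumption: reads are taken longest first; the thread does not say
    how the ~50X subset is chosen.
    """
    budget = genome_size * max_depth
    kept, total = [], 0
    for length in sorted(read_lengths, reverse=True):
        if total >= budget:
            break
        kept.append(length)
        total += length
    return kept


def pick_edge_min(total_bases, genome_size, low_cov_threshold=20.0, default=3):
    """Return 2 instead of the default 3 when the estimated depth is low.

    Assumption: low_cov_threshold is a hypothetical cutoff; the thread
    only says "too low coverage" without giving a number.
    """
    depth = total_bases / genome_size
    return 2 if depth < low_cov_threshold else default


if __name__ == "__main__":
    genome_size = parse_size("600m")
    print("base budget at 50X:", int(genome_size * 50))
    print("edge-min at 10X:", pick_edge_min(10 * genome_size, genome_size))
    print("edge-min at 60X:", pick_edge_min(60 * genome_size, genome_size))
```

Under these assumptions, a run without -g has no genome size to derive a depth from, so neither the read subsampling nor the edge-depth adjustment would kick in. That could plausibly account for different assembly statistics between the two runs, though the thread does not confirm it.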

trichoptera commented 5 years ago

Thank you for your answer!