ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

Is it necessary to further run consensus tools on the results of wtdbg or smartdenovo? #5

Closed YiweiNiu closed 6 years ago

YiweiNiu commented 6 years ago

Hi Jue,

I'm sorry to bother you once again.

I found an evaluation paper, and it says (paragraph 11 of "Discussion"):

...Wtdbg assemblies, which always ranked last, mostly because no consensus procedure was executed, would need additional rounds of consensus polishing to effectively compete with other assemblers.

So I'm wondering whether it is still necessary to run additional consensus tools, such as Racon, after running wtdbg1.1.006, wtdbg1.2.8, or smartdenovo. I know all three tools have consensus modules and have been updated since this paper was published.

I'm working on a de novo genome assembly project, and there are very limited genomic resources for evaluating correctness. Besides PacBio data, I also have several short-read libraries, so I want to perform scaffolding based on the wtdbg results. I don't know how errors in the contigs would affect the scaffolding.

Any suggestions or thoughts would be appreciated. Thank you!

Best, Yiwei Niu

ruanjue commented 6 years ago

Thanks for your interest.

The built-in consensus tool wtdbg-cns aims to provide a quick way to reduce sequencing errors. I suggest using Quiver and/or Pilon to polish the consensus sequences once you are happy with the assembly. Usually, wtdbg-cns can reduce the error rate to below 1%, which is low enough for short reads to align well.
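For scale, a per-base error rate can be converted to a Phred-style quality value and to an expected number of erroneous bases per contig. The sketch below is only illustrative: the function names are hypothetical, and it assumes errors are spread uniformly along the contig, which real assemblies do not guarantee.

```python
import math


def phred_qv(error_rate: float) -> float:
    """Convert a per-base error rate to a Phred-scaled quality value.

    Uses the standard definition QV = -10 * log10(error_rate).
    """
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in (0, 1]")
    return -10.0 * math.log10(error_rate)


def expected_errors(contig_length: int, error_rate: float) -> float:
    """Expected number of erroneous bases, assuming uniform errors (an assumption)."""
    if contig_length < 0:
        raise ValueError("contig_length must be non-negative")
    return contig_length * error_rate


if __name__ == "__main__":
    rate = 0.01  # the "less than 1%" upper bound quoted in the comment
    print(f"QV at {rate:.0%} error: {phred_qv(rate):.1f}")
    print(f"Expected errors in a 1 Mb contig: {expected_errors(1_000_000, rate):.0f}")
```

At the 1% bound this corresponds to roughly QV20, which is why a further polishing round with short reads is still worthwhile before relying on base-level accuracy.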

BTW, the paper used strange parameters for wtdbg, so its claim that wtdbg always ranked last has limited validity. I emailed the author just after it was published.

YiweiNiu commented 6 years ago

Thank you for your reply!

Good to know the error rate of the wtdbg-cns results.

BTW, I learned about wtdbg and smartdenovo through this paper. After trying several pipelines (Canu, MECAT, miniasm, MaSuRCA, Flye, etc.), wtdbg is the fastest and the best so far (at least it gives the most satisfying N50). I will use the wtdbg results for downstream analysis.
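Since N50 is the metric used for the comparison here, this is a minimal sketch of how it is commonly computed from a list of contig lengths: sort the contigs from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name and the example lengths are made up for illustration and are not taken from any of the assemblers mentioned.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Hypothetical contig lengths, not real assembly output.
    example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(example))
```

A higher N50 indicates better contiguity only; it says nothing about base accuracy or misassemblies, so it is best read alongside the consensus error rate.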

Thank you again for your great tools!

ruanjue commented 6 years ago

Good luck!