bcgsc / ntEdit

✏️ Genome assembly polishing & SNV detection
GNU General Public License v3.0

How much does a short read assembly benefit from ntEdit #19

Closed: ms-gx closed this issue 3 years ago

ms-gx commented 3 years ago

In your experience, if I have a short-read assembly (Illumina PE 150 bp) assembled with ABySS, how much does the assembly generally benefit from polishing with ntEdit? Would you recommend it? Why or why not?

I will try it on my assemblies anyway, but would like to hear about your experience.

warrenlr commented 3 years ago

Thank you for your message and interest in ntEdit.

Marginal benefit, if any. If you do polish, I would recommend running ntEdit with two Bloom filters, the second one comprising repeat k-mers to exclude (please consult the guidelines on how to select a proper k with specific jump j values). That's because ntEdit operates with less context than alignment-based polishers like Pilon or Racon, and some underrepresented k-mers may be incorrectly replaced with abundant (repeated) ones. I would also advise you to monitor the changes closely, say with BUSCO or another means of evaluating the resulting assemblies.
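To make the two-filter idea concrete, here is a toy sketch in Python. It is not ntEdit's actual algorithm or API: plain Python sets stand in for the two Bloom filters, and the helper names (`kmers_covering`, `accept_edit`) and the all-or-nothing acceptance rule are assumptions made purely for illustration. It only shows why a second set of repeat k-mers can veto an edit that the first set alone would accept.

```python
def kmers_covering(seq, pos, k):
    """Return every k-mer of seq that overlaps position pos (0-based)."""
    start = max(0, pos - k + 1)
    end = min(pos, len(seq) - k)
    return [seq[i:i + k] for i in range(start, end + 1)]


def accept_edit(seq, pos, new_base, k, solid, repeats=frozenset()):
    """Toy rule (an assumption, not ntEdit's real thresholds): accept a
    substitution only if every k-mer it creates is in the solid set and
    none of them is in the repeat set."""
    edited = seq[:pos] + new_base + seq[pos + 1:]
    affected = kmers_covering(edited, pos, k)
    if not affected:
        return False  # sequence shorter than k: nothing to verify
    return all(km in solid and km not in repeats for km in affected)


# Hypothetical data: a draft contig and k-mers observed in the reads.
draft = "ACGTTGCA"
solid = {"ACGA", "CGAT", "GATG", "ATGC"}  # stand-in for the main Bloom filter
repeats = {"GATG"}                         # stand-in for the repeat filter

# One filter: the T->A change at position 3 looks well supported.
one_filter = accept_edit(draft, 3, "A", 4, solid)
# Two filters: the same change is rejected because it creates a repeat k-mer.
two_filters = accept_edit(draft, 3, "A", 4, solid, repeats)
```

With only the solid set, the edit is accepted; once the repeat set is supplied, the same edit is rejected because one of the new k-mers is flagged as repetitive. The real tool works on Bloom filters and uses its own acceptance parameters, so treat this purely as an illustration of the reasoning.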

ms-gx commented 3 years ago

Thanks for your comment, this is really helpful!

Yes, we always use BUSCO, N50, mapping rate of the raw data, and whatever else we can to assess the quality of the assembly :)