broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

About the N in result? #10

Closed yilunhuangyue closed 7 years ago

yilunhuangyue commented 8 years ago

I used Pilon to polish an assembly (contigs, which do not contain any Ns), but after running Pilon there are some Ns in my assembly. This means the N50 of the contigs is smaller than before. What do these Ns mean? Is there any way to cut out these Ns? Thanks so much!

w1bw commented 8 years ago

Pilon should not introduce new gaps unless the "--fix breaks" option is turned on. It is off by default, but is turned on as part of the "--variant" option.

When this happens, Pilon could not verify the contiguity of the region and attempted to reassemble it. It made progress, but was unable to find a closed solution, so it introduced a gap (probably 10 Ns).

I don't normally recommend this option for assembly improvement (if you're conservative), but it is important for variant calling, so that Pilon can identify large indel events even when it can't put the whole region back together.
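If you do want to locate or cut the Ns in an already polished assembly, a minimal Python sketch along these lines may help. It is not part of Pilon; the function names `find_n_runs` and `split_at_n_runs` and the `min_len` parameter are made up for illustration, and it assumes you have already read each contig into a plain string.

```python
import re

def find_n_runs(seq, min_len=1):
    """Return 0-based, half-open (start, end) coordinates of each run of N/n
    that is at least min_len bases long."""
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(seq)]

def split_at_n_runs(seq, min_len=1):
    """Split a contig at every run of N/n of at least min_len bases.
    The Ns themselves are dropped, as are any empty pieces."""
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [piece for piece in pattern.split(seq) if piece]

if __name__ == "__main__":
    # Hypothetical contig containing one 10-N gap.
    contig = "ACGT" + "N" * 10 + "TTGA"
    print(find_n_runs(contig))
    print(split_at_n_runs(contig))
```

Setting `min_len=10` would restrict the split to gaps of at least 10 Ns and leave shorter runs of ambiguous bases in place. Note that splitting discards the information that the two pieces are adjacent, so it may be preferable to rerun Pilon without the breaks fix instead.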