lbcb-sci / racon

Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads
MIT License

What will happen if draft contains stretches of Ns? #39

Closed ms-gx closed 4 years ago

ms-gx commented 4 years ago

What will happen if I polish a draft genome that contains stretches of Ns, for example those generated by Flye when scaffolding contigs? If I remember correctly, it inserts 100 Ns between the contigs.

I would add parameter "-u" but I assume this has influence on my question.

rvaser commented 4 years ago

Hello, if the stretch of Ns is either at the beginning or end of a 500bp window, it might get truncated. My best bet would be to polish your assembly before scaffolding.

Option -u is used when you want all of your contigs in the output file, otherwise those that are not polished at all will be dropped.

Best regards, Robert
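
To see which N stretches in a draft might be affected, one could list the runs of Ns that touch or cross a window boundary. The sketch below is not part of racon. It assumes fixed, non-overlapping 500 bp windows counted from the start of each contig, which is a simplification of the reply above, and the function names are made up for illustration.

```python
import re

WINDOW = 500  # assumed window length, taken from the 500bp figure in the reply


def n_runs(seq):
    """Return half-open (start, end) coordinates of every run of Ns in seq."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", seq)]


def touches_window_edge(start, end, window=WINDOW):
    """True if the run begins a window, ends a window, or spans a boundary."""
    starts_window = start % window == 0
    ends_window = end % window == 0
    crosses_boundary = start // window != (end - 1) // window
    return starts_window or ends_window or crosses_boundary


def risky_n_runs(seq, window=WINDOW):
    """Return the N runs that sit at or across an assumed window boundary."""
    return [(s, e) for s, e in n_runs(seq) if touches_window_edge(s, e, window)]
```

Runs reported by `risky_n_runs` are only candidates for truncation under these assumptions; whether racon actually trims them depends on the read coverage around them.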

ms-gx commented 4 years ago

Thank you for the quick response! Yes, in the meantime I saw that racon truncates those stretches of Ns. There is no way to avoid this I guess?

rvaser commented 4 years ago

Actually, there is the --no-trimming option, which prevents that truncation. Forgot about it :D

ms-gx commented 4 years ago

OK, that sounds promising! So with "--no-trimming" racon will basically just correct regions where it is able to map reads and the others it will leave untouched? I did not really understand this parameter from the help...

rvaser commented 4 years ago

Yes, everything that can be corrected with the reads will be, and nothing shall be trimmed away due to low coverage.
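
Putting the two options from this thread together, a polishing call could be assembled as in the sketch below. The file names are placeholders, the helper `racon_command` is hypothetical, and the positional order (reads, overlaps, draft) follows racon's usage text as I understand it, so check `racon --help` for your version before relying on it.

```python
import shlex


def racon_command(reads, overlaps, draft, keep_unpolished=True, no_trimming=True):
    """Build a racon argument list; nothing is executed here."""
    cmd = ["racon"]
    if keep_unpolished:
        cmd.append("-u")  # keep contigs that were not polished at all
    if no_trimming:
        cmd.append("--no-trimming")  # do not trim low-coverage regions
    cmd += [reads, overlaps, draft]  # placeholder input paths
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before being run by hand.
    print(shlex.join(racon_command("reads.fastq", "overlaps.paf", "draft.fasta")))
```

Racon writes the polished sequences to standard output, so in practice the printed command would be redirected to a file.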

ms-gx commented 4 years ago

OK, many thanks!