marbl / verkko

Telomere-to-telomere assembly of accurate long reads (PacBio HiFi, Oxford Nanopore Duplex, HERRO corrected Oxford Nanopore Simplex) and Oxford Nanopore ultra-long reads.

Help doc clarification #256

Closed by dmacguigan 4 months ago

dmacguigan commented 4 months ago

Hello,

In the "Getting Started" section of your GitHub page, you state:

For HERRO corrected reads, provide the corrected reads with the --hifi option and the uncorrected reads as --nano

I'm a bit confused by the wording. Does this mean the user supplies the same uncorrected read dataset with --nano that was corrected with HERRO? Doesn't that mean you're effectively double counting each read?

Thanks for your help, Dan

skoren commented 4 months ago

Yes, that's right: you'd supply the reads from before correction to --nano.

The nano and hifi reads are used in different parts of verkko's resolution, so they wouldn't get double-counted. The nano reads are used after the hifi reads, to extend phasing and fill gaps. For this, we prefer the original reads in case correction introduced any phase switches, or shortened or removed some reads and created a gap. This step runs on the hifi-simplified graph, and any reads fully contained in a node are ignored. The nano reads are only used for consensus where they have filled in a gap.
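As a rough illustration of the setup described above, here is a minimal Python sketch that assembles the command line: HERRO-corrected reads go to --hifi and the same reads before correction go to --nano. The -d, --hifi and --nano options are the ones from the README; the helper name build_verkko_command, the output directory, and the file names are made up for the example. The script only prints the command and does not run verkko.

```python
import shlex


def build_verkko_command(out_dir, corrected_reads, uncorrected_reads):
    """Build a verkko argument list for HERRO-corrected ONT data.

    corrected_reads   -> passed to --hifi (reads after HERRO correction)
    uncorrected_reads -> passed to --nano (the same reads before correction)
    """
    if not corrected_reads or not uncorrected_reads:
        raise ValueError("both corrected and uncorrected reads are required")
    return [
        "verkko",
        "-d", out_dir,
        "--hifi", *corrected_reads,
        "--nano", *uncorrected_reads,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_verkko_command(
        "asm",
        ["herro_corrected.fastq.gz"],
        ["ont_uncorrected.fastq.gz"],
    )
    # Print a shell-safe version of the command instead of executing it.
    print(shlex.join(cmd))
```

The same read set shows up twice on the command line, once corrected and once uncorrected, which is expected given that the two inputs are used at different stages.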

dmacguigan commented 4 months ago

Thanks for the clarification!