marbl / verkko

Telomere-to-telomere assembly of accurate long reads (PacBio HiFi, Oxford Nanopore Duplex, HERRO corrected Oxford Nanopore Simplex) and Oxford Nanopore ultra-long reads.

Herro dropped reads #290

Closed fergsc closed 1 month ago

fergsc commented 1 month ago

Hi,

I'm experimenting with different assembly strategies for some primates. The reads that herro fails to correct (repetitive reads) are dropped. Has anyone tried adding these reads back into the herro read set and passing them to verkko under the --hifi parameter? I'm concerned about the lack of coverage over hard-to-assemble, repetitive regions and the effect that herro dropping reads will have.

Thanks.

skoren commented 1 month ago

Not sure how you'd add them since they aren't high enough quality to be used as HiFi inputs. We suggest providing the uncorrected ONT data to verkko as well so it can be used for repeat resolution and gap filling. That should address most smaller dropouts but certainly not very large dropouts that can't be spanned.

That said, on human data we haven't seen significant dropout in HERRO-corrected assemblies; the graph looks much more complete than one from HiFi+ONT data. If you have regions that are being dropped and not corrected, it's probably best to report them to the HERRO developers to see if they can be corrected/recovered.
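
For anyone scripting this, the suggestion above (corrected reads as the accurate input, uncorrected ONT alongside for repeat resolution and gap filling) might look roughly like the sketch below. It only assembles the command line; the output directory and file names are hypothetical placeholders, and it assumes verkko's `-d`, `--hifi` and `--nano` options, so check `verkko --help` for your installed version before relying on it.

```python
import shlex


def build_verkko_command(out_dir, corrected_reads, raw_ont_reads):
    """Build an argv list for a verkko run.

    corrected_reads: HERRO-corrected (and/or HiFi) read files, passed via --hifi.
    raw_ont_reads:   uncorrected ONT read files, passed via --nano.
    All paths are supplied by the caller; nothing is checked on disk.
    """
    if not corrected_reads:
        raise ValueError("at least one corrected/HiFi read file is required")
    cmd = ["verkko", "-d", out_dir, "--hifi", *corrected_reads]
    if raw_ont_reads:
        # The uncorrected ONT reads are given in addition to the corrected set.
        cmd += ["--nano", *raw_ont_reads]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_verkko_command(
        "asm",
        ["herro_corrected.fasta.gz", "hifi.fastq.gz"],
        ["ont_raw.fastq.gz"],
    )
    # Print a shell-quoted command line instead of running it.
    print(shlex.join(cmd))
```

The reads that HERRO dropped are not added to the `--hifi` list here; they reach verkko only as part of the uncorrected ONT input.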

fergsc commented 1 month ago

Adding them in as HiFi inputs was what I was wondering about. Good to know that I shouldn't spend any compute on this. I will proceed with Herro+HiFi and dorado as suggested.

Thanks.