lbcb-sci / herro

HERRO is a highly accurate, haplotype-aware deep-learning tool for error correction of Nanopore R10.4.1 or R9.4.1 reads (a read length of >= 10 kbp is recommended).

How about Ploidy? #23

Open rfinkers opened 2 months ago

rfinkers commented 2 months ago

Hi, I'm wondering whether this method would also be applicable to datasets from polyploid species. Any thoughts in that direction?

Thanks!

alexweisberg commented 2 months ago

Similar question here: is this program applicable to sequencing data from bacteria, or is the model designed only for human sequence data?

dominikstanojevic commented 2 months ago

Hello Alexandra,

The model was trained exclusively on human data, specifically a few chromosomes from the HG002 dataset. We've observed encouraging results when applying it to other human samples, including haploid datasets like CHM13, as well as to other organisms (figure below). This suggests that it's definitely worth trying the model on your data.

[Figure: herro_qv]

Just make sure that the read length is OK (>= 10 kbp; you may be able to push it down to 8 kbp).
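For anyone who wants to check the read-length requirement before running the tool, here is a minimal sketch of a length filter in plain Python. It is not part of HERRO; the function names (`read_fastq`, `filter_by_length`, `filter_file`) and the file paths are hypothetical, and it assumes an uncompressed FASTQ file with exactly four lines per record.

```python
from typing import Iterable, Iterator, Tuple

# Threshold taken from the recommendation in this thread (>= 10 kbp).
# 8_000 is mentioned as a possible lower bound; treat it as experimental.
MIN_READ_LEN = 10_000

Record = Tuple[str, str, str]  # (header, sequence, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    record = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line and not record:
            continue  # skip blank lines between records
        record.append(line)
        if len(record) == 4:
            header, seq, plus, qual = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            yield header, seq, qual
            record = []
    if record:
        raise ValueError("truncated FASTQ: last record is incomplete")


def filter_by_length(records: Iterable[Record],
                     min_len: int = MIN_READ_LEN) -> Iterator[Record]:
    """Keep only reads whose sequence is at least min_len bases long."""
    for header, seq, qual in records:
        if len(seq) >= min_len:
            yield header, seq, qual


def filter_file(in_path: str, out_path: str,
                min_len: int = MIN_READ_LEN) -> int:
    """Write the reads that pass the filter to out_path; return how many were kept."""
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq, qual in filter_by_length(read_fastq(src), min_len):
            dst.write(f"{header}\n{seq}\n+\n{qual}\n")
            kept += 1
    return kept
```

Calling `filter_file("reads.fastq", "reads.min10k.fastq")` (both paths are placeholders) would write only the reads of at least 10,000 bases. Passing `min_len=8_000` tries the lower bound suggested above. For gzipped or very large inputs, a dedicated tool is probably a better fit than this sketch.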

If you decide to use this tool, we would appreciate your feedback, especially if the results are unsatisfactory. This will help us identify any issues and broaden the model's applicability.

Best regards, Dominik