c-zhou / yahs

Yet another Hi-C scaffolding tool
MIT License

Are there any requirements for the input genome assembly results? #48

XXH123a closed this issue 1 year ago

XXH123a commented 1 year ago

Can YaHS be used with a prefix.p_utg.gfa assembly from Hifiasm, or with a low-heterozygosity (0.17%) genome assembled with NextDenovo? Looking forward to your reply, thank you very much!

c-zhou commented 1 year ago

Hello @XXH123a,

YaHS is not haplotype-aware, i.e., for a diploid, if both haplotypes are present in the input, the two homologous chromosomes tend to be mixed together. So typically, only one copy of each chromosome should be retained in the input contigs.

In your case, the genome has very low heterozygosity, so it might behave more like a haploid, i.e., the two haplotypes were collapsed by Hifiasm and not much haplotypic duplication remains in the assembly. It is probably fine to use it as the input.

Best, Chenxi
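
A practical note on the prefix.p_utg.gfa part of the question: YaHS takes its contigs as a FASTA file, while prefix.p_utg.gfa is an assembly graph in GFA format, so the sequences would have to be extracted first. Below is a minimal Python sketch of that conversion, assuming the GFA stores each sequence on an S (segment) line as tab-separated name and sequence fields. The function name gfa_to_fasta and the 60-column line width are illustrative choices, not part of YaHS or Hifiasm. Keep in mind that a unitig graph typically still contains both haplotypes, so the caveat above about haplotypic duplication still applies to the extracted sequences.

```python
def gfa_to_fasta(gfa_lines, width=60):
    """Yield FASTA lines for every segment (S) record in an iterable of GFA lines.

    Assumes the tab-separated layout `S <name> <sequence> [tags...]`.
    Segments whose sequence is the placeholder `*` are skipped.
    """
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue  # ignore header (H), link (L), and other record types
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[1], fields[2]
        if seq == "*":
            continue  # sequence not stored in the GFA
        yield f">{name}"
        # wrap the sequence at `width` columns
        for i in range(0, len(seq), width):
            yield seq[i:i + width]


def convert_file(gfa_path, fasta_path):
    """Write the segments of the GFA file at gfa_path to fasta_path (hypothetical helper)."""
    with open(gfa_path) as gfa, open(fasta_path, "w") as fasta:
        for out_line in gfa_to_fasta(gfa):
            fasta.write(out_line + "\n")
```

For example, convert_file("prefix.p_utg.gfa", "prefix.p_utg.fa") would write one FASTA record per unitig. The resulting FASTA still needs the usual index before it can be passed to YaHS.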

andreaschavez commented 1 year ago

I'm following up on the previous question. For a diploid species (a mammal), is it better to run YaHS on one of the two fully phased haplotype assemblies from Hifiasm (e.g., prefix.dip.hap1.p_ctg.fa or prefix.dip.hap2.p_ctg.fa) or on the complete assembly with long stretches of phased blocks (prefix.p_ctg.fa)?