chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

Is purge_dups meaningful for Hi-C phasing assembly? #525

Open lizihe21 opened 1 year ago

lizihe21 commented 1 year ago

Hi, thanks for your Hi-C integration work, which makes phasing much easier, especially for animal genome assemblies. I have run a number of mammalian genome assemblies in this mode. Most of the assemblies are larger than the genome size estimated from k-mer analysis. How does the purge_dups function work in this mode? And is it meaningful to additionally run purge_dups on each haplotype assembly?

chhylp123 commented 1 year ago

It is better not to run purge_dups on Hi-C phased assemblies. K-mer-based evaluation often underestimates the genome size.