Open sivico26 opened 10 months ago
Hi,
That's unexpected. I think it probably represents an error in hapdup rather than something meaningful. It is likely some kind of edge case, where a phasing block boundary is very close to a contig end, but the coordinates are shifted slightly between haplotypes. As a result, hapdup split contig_16 in HP1, but not in HP2.
In dual assembly mode this should not happen, but for the phasing mode I'll try to fix it in a future release.
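To check whether other contigs are affected by the same kind of asymmetric split, one option is to compare the sequence names in the two phased FASTA files. The following is only a minimal sketch, not part of hapdup: the function names are made up for illustration, and the contig names in the example are hypothetical, since the exact naming of split contigs is not shown in this thread.

```python
def contig_names(fasta_text):
    """Return the set of sequence names (first word of each '>' header)."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def compare_haplotypes(hap1_text, hap2_text):
    """Group contig names by whether they occur in one or both haplotypes."""
    h1 = contig_names(hap1_text)
    h2 = contig_names(hap2_text)
    return {
        "shared": sorted(h1 & h2),
        "only_hap1": sorted(h1 - h2),
        "only_hap2": sorted(h2 - h1),
    }


# Hypothetical example: contig_16 is split in hap1 but intact in hap2.
hap1 = ">contig_7\nACGT\n>contig_16_a\nAC\n>contig_16_b\nGT\n"
hap2 = ">contig_7\nACGT\n>contig_16\nACGT\n"
print(compare_haplotypes(hap1, hap2))
```

For real data, the text of each file can be read with `open(path).read()` and passed to `compare_haplotypes`. Names that appear in only one haplotype are candidates for the split described above, though they may also have other explanations.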
All right, if you need some data to debug this, let me know.
I am wondering: if the assembly has some redundancy, do you think it could cause or contribute to this problem? I am working with Flye assemblies, but I have not checked whether they contain any redundancy.
How big is your dataset? If you could send it somehow, that would be helpful! Feel free to email mikolmogorov@gmail.com
I don't think this is specific to the genome, just a borderline case.
Hi @fenderglass,
Thanks for developing Hapdup. I am trying to phase some loci of an allopolyploid plant into what should be the 2 subgenomes of its parents. After checking the output, I have some questions. I will use one of the assemblies as an example.

For one of my loci, if I look into the hapdup_phased_* assemblies, I can see the following names for hap1:

While for hap2 it is:

As you can see, most of the contigs have their homolog in both haplotypes (contigs 7, 10, 14, and 23). But there are two other categories that confuse me: