Open bostanict opened 7 years ago
The haplotigs associated with each primary contig represent a mixture of the two haplotypes, and are not phased relative to each other. The haplotigs are basically structural variation that was found in the assembly graph and subsequently "snipped out". In a de novo assembly context, there is not enough information present to properly phase unlinked contigs. Moreover, each primary contig is only "partially phased", meaning there are "contiguous phased blocks", but one contig may be composed of multiple "phased blocks".
I suggest you read the supplementary methods of the falcon_unzip paper a little more closely.
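To make the point about unlinked phased blocks concrete, here is a toy sketch in Python. It is not falcon_unzip code: the `PhaseBlock` class, the coordinates, and the `possible_haplotig_origins` helper are all hypothetical and only model the idea that, without information linking the blocks, the haplotig from each block could come from either haplotype.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class PhaseBlock:
    """One contiguous phased block on a primary contig (hypothetical toy model)."""
    start: int
    end: int


def possible_haplotig_origins(blocks):
    """List every way the haplotig of each block could map to haplotype 'A' or 'B'.

    With no linking information between blocks, the assembly alone
    cannot tell these combinations apart.
    """
    return [dict(zip(blocks, combo)) for combo in product("AB", repeat=len(blocks))]


# Three made-up phased blocks on one primary contig (coordinates are illustrative).
blocks = [
    PhaseBlock(0, 50_000),
    PhaseBlock(80_000, 120_000),
    PhaseBlock(150_000, 210_000),
]

assignments = possible_haplotig_origins(blocks)

# Keep only the assignments where every haplotig comes from the same haplotype.
consistent = [a for a in assignments if len(set(a.values())) == 1]

print(len(assignments))  # 2 ** 3 = 8 possible combinations
print(len(consistent))   # 2 (all 'A' or all 'B')
```

In this toy model only 2 of the 8 combinations have every haplotig on the same haplotype, which is why the set of haplotigs for a primary contig should be treated as a mixture unless something else links the blocks.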
Dear Greg,
Thanks for the comment.
Best,
Hi fellow,
Reading the Unzip paper, it seems that Unzip has a strategy for phasing the alternative haplotigs into the correct haplotype relative to each other within a primary contig, right?
In other words, within a primary contig, are all the associated haplotigs phased correctly relative to each other? Assuming the primary contig is haplotype A, are all the haplotigs labeled as associated contigs really from haplotype B? Or can they be mixed up in some regions, when the overlapping sequence is not divergent enough to create a fork, so that the contigs following it are not 100% correctly phased?
Thanks,