I just wanted to clarify my understanding of the haplotypes produced in Hi-C mode.
Based on the paper and the docs, my interpretation of the hap1/hap2 output files when both HiFi and Hi-C data are used in the assembly process is:
The contigs should be haplotigs, and hifiasm will typically output all the haplotigs that form a chromosome in the same haplotype file.
The Hi-C data can phase within a chromosome (e.g. the haplotig example above), but it can't group haplotigs from different chromosomes by parent of origin. Doing that would require trio data.
Therefore each haplotype file should typically consist of phased contigs (i.e. haplotigs) that together constitute a chromosome, but the combination of chromosomes within a haplotype file is likely to be a mix of maternal and paternal origin.
For example, hap1.p_ctg might contain maternal chromosome 1 haplotigs but paternal chromosome 2 haplotigs, and so on.
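To make the interpretation I'm asking about concrete, here is a toy sketch in Python. It is not hifiasm's algorithm and it uses no hifiasm API; the function name `toy_hic_partition` and the chromosome labels are made up for illustration. It only models the assumption that each chromosome is phased consistently, while which parent ends up in hap1 is decided independently per chromosome.

```python
import random


def toy_hic_partition(chromosomes, seed=0):
    """Toy model of my interpretation, NOT hifiasm's actual method.

    Each chromosome is phased consistently (one parent per haplotype file),
    but which parent lands in hap1 is chosen independently per chromosome.
    """
    rng = random.Random(seed)
    hap1, hap2 = {}, {}
    for chrom in chromosomes:
        parents = ["maternal", "paternal"]
        rng.shuffle(parents)  # independent choice for every chromosome
        hap1[chrom], hap2[chrom] = parents
    return hap1, hap2


hap1, hap2 = toy_hic_partition(["chr1", "chr2", "chr3"])
for chrom in hap1:
    print(chrom, "hap1:", hap1[chrom], "hap2:", hap2[chrom])
```

Under this model, hap1 and hap2 always hold opposite parents for a given chromosome, but nothing ties the choice for chr1 to the choice for chr2.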
Is that roughly correct?
Thanks for the help,
Al