Closed by baozg 2 years ago
Hi Zhigui,
My interpretation is that genomic differences between haplotype-resolved assemblies would be phased by definition, since the assemblies themselves are phased.
However, merging that information into a VCF is tricky because it requires a linear representation of non-linear rearrangements. We are developing methods for that, but they are not available yet.
Best Manish
Hi Manish,
Yes, that is definitely what I am looking for. Hopefully it can support both diploid and autopolyploid genomes. Looking forward to your development.
Hi @baozg,
I have a question and would like to know your opinion. Let's say you are comparing two diploid genomes, one of which you consider the reference and the other the query. What would you want in the VCF? In this case, four pairwise genome comparisons would happen. Do you expect all of them to be in one VCF? The VCF is based on the genomic coordinates of the reference genome, but the reference is diploid, so either we create two VCFs (one for each reference haplotype) or we treat one reference haplotype as the "true" reference genome and compare all other haplotypes to it. It would be great if you could share which option makes more sense to you and why.
Best Manish
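To make the two options in the question above concrete, here is a minimal sketch that only enumerates which haplotype pairs would be compared. The haplotype labels and the `anchored_pairs` helper are hypothetical names used for illustration; this is not part of SyRI.

```python
from itertools import product

# Hypothetical labels for the haplotypes of a diploid reference and a diploid query.
ref_haps = ["ref_hap1", "ref_hap2"]
qry_haps = ["qry_hap1", "qry_hap2"]

# Option A: compare every reference haplotype with every query haplotype.
# This gives four comparisons, which could be written as two VCFs
# (one per reference haplotype, each with its own coordinate system).
all_pairs = list(product(ref_haps, qry_haps))


def anchored_pairs(anchor, haplotypes):
    """Option B: treat one haplotype as the single coordinate system
    and compare every other haplotype to it."""
    return [(anchor, hap) for hap in haplotypes if hap != anchor]


anchored = anchored_pairs("ref_hap1", ref_haps + qry_haps)

print(len(all_pairs), all_pairs)
print(len(anchored), anchored)
```

Under option B, the second reference haplotype becomes just another sample relative to the chosen anchor, so all variants share one coordinate system and can sit in a single VCF.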
Hi Manish
Actually, for human genomes there are already some pipelines for phased assemblies (Dipcall, SVIM-asm, https://github.com/EichlerLab/pav). In plants, I typically set a haploid assembly as the reference (a doubled haploid or an inbred line), then use the haplotype-resolved assembly of a diploid to call a VCF with phased genotypes (1|0 / 0|1 / 1|1 / 0|0), and then use bcftools merge if I have more than one individual.
For population-level assemblies, we need to do all-to-all alignments and then call variants from all of the alignments. For that case I prefer to use a graph pangenome to call variants (vg deconstruct).
Thanks Zhigui
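The haploid-reference approach described above (call variants for each haplotype of the diploid assembly separately, then encode the result as a phased genotype) can be sketched as follows. This is an illustrative sketch only, not how Dipcall, SVIM-asm, or PAV work internally: the `phase_genotypes` function and the toy variant tuples are hypothetical, and it assumes both call sets use the same reference coordinates and the same variant representation.

```python
def phase_genotypes(hap1_calls, hap2_calls):
    """Combine two per-haplotype call sets into phased diploid genotypes.

    Each input is a set of (chrom, pos, ref, alt) tuples obtained by comparing
    one haplotype assembly with the haploid reference. A site carried only by
    haplotype 1 becomes 1|0, only by haplotype 2 becomes 0|1, and by both 1|1.
    Sites absent from both haplotypes are 0|0 and are simply not reported.
    """
    genotypes = {}
    for site in sorted(hap1_calls | hap2_calls):
        genotypes[site] = f"{int(site in hap1_calls)}|{int(site in hap2_calls)}"
    return genotypes


# Toy call sets (made-up coordinates, for illustration only).
hap1 = {("chr1", 100, "A", "T"), ("chr1", 250, "G", "C")}
hap2 = {("chr1", 250, "G", "C"), ("chr1", 400, "T", "TA")}

for (chrom, pos, ref, alt), gt in phase_genotypes(hap1, hap2).items():
    print(chrom, pos, ref, alt, gt, sep="\t")
```

The sketch treats different ALT alleles at the same position as separate records and does not handle overlapping or nested rearrangements, which is the part that makes a general phased VCF hard to produce.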
Thanks Zhigui for sharing your ideas. This is very helpful.
Hi, @mnshgl0110
Can SyRI be extended to phased assemblies, so that it generates a phased VCF of variants from haplotype-resolved assemblies?
Thanks Zhigui