samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/

How to count number of haplotypes and haplotype frequency from phased vcf files? #2201

Closed Arline13 closed 1 month ago

Arline13 commented 1 month ago

Hello,

I'm pretty new to bioinformatics analysis.

I have phased VCF files that I created with Beagle on Linux, and now I want to count the number of unique haplotypes and the haplotype frequencies in them.

I tried to find a script using WhatsHap, bcftools, or Beagle, but I can't find a tutorial.

Can someone help me with this?

Thanks in advance

Arline

pd3 commented 1 month ago

We don't have a function to do that. To be honest, I am not even sure what kind of metric to use, i.e. when two overlapping haplotypes can be considered identical. The PBWT algorithm developed by Richard Durbin might be a good place to start.
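For illustration only, here is a minimal Python sketch of one possible metric. It is not a bcftools feature and not PBWT. It assumes that two haplotypes count as identical only if they carry the same allele at every site in the input, that all samples are diploid and fully phased (`|` separator), and that the input is uncompressed VCF text. The function name `count_haplotypes` is hypothetical.

```python
from collections import Counter


def count_haplotypes(vcf_lines):
    """Count distinct haplotypes over all sites in an iterable of VCF lines.

    Assumption (not from bcftools): two haplotypes are identical only if
    they carry the same allele index at every site in the input.
    Returns {haplotype tuple: (count, frequency)}.
    """
    haplotypes = None  # one allele list per haplotype, two per diploid sample
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        gt_index = fields[8].split(":").index("GT")
        alleles = []
        for sample in fields[9:]:
            gt = sample.split(":")[gt_index]
            if "|" not in gt:
                raise ValueError(
                    f"unphased or haploid genotype {gt!r} at {fields[0]}:{fields[1]}"
                )
            alleles.extend(gt.split("|"))
        if haplotypes is None:
            haplotypes = [[] for _ in alleles]
        for hap, allele in zip(haplotypes, alleles):
            hap.append(allele)

    # Tuples keep multi-digit allele indices unambiguous.
    counts = Counter(tuple(hap) for hap in haplotypes or [])
    total = sum(counts.values())
    return {hap: (n, n / total) for hap, n in counts.items()}
```

The function accepts any iterable of lines, such as an open file handle. Because the whole input is treated as one window, it is only meaningful on a short region, for example one extracted beforehand with `bcftools view -r` from an indexed file. Over a long region nearly every haplotype will be unique.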