Open Yegor13 opened 3 months ago
Thanks for your question, and sorry it's taken me a while to get to it.
It's because of the way the functions are written: they assume all the data is for a single locus when adding unmeasured alleles and when summing allele frequencies and sample sizes. There's nothing stopping you from calculating the allele frequencies of different loci separately.
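To illustrate what "separately" means here, this is a minimal plain-Python sketch, not HLAfreq's own code. The records, the `combine_per_locus` helper, and the sample-size-weighted average are all assumptions made for illustration; they stand in for whatever estimator is actually used. The point is only that the data is split by locus first, so frequencies within each locus sum to 1.

```python
from collections import defaultdict

# Hypothetical study records: (locus, allele, allele_freq, sample_size).
# These numbers are invented for illustration, not real population data.
records = [
    ("A", "A*01", 0.20, 100),
    ("A", "A*02", 0.80, 100),
    ("A", "A*01", 0.30, 300),
    ("A", "A*02", 0.70, 300),
    ("B", "B*07", 0.40, 200),
    ("B", "B*08", 0.60, 200),
]


def combine_per_locus(records):
    """Sample-size-weighted mean allele frequency, computed per locus.

    Simplifying assumption: every study of a locus reports every allele.
    """
    weighted_sum = defaultdict(float)  # (locus, allele) -> sum of freq * n
    total_n = defaultdict(int)         # (locus, allele) -> sum of n
    for locus, allele, freq, n in records:
        weighted_sum[(locus, allele)] += freq * n
        total_n[(locus, allele)] += n

    combined = defaultdict(dict)       # locus -> {allele: combined freq}
    for (locus, allele), total in weighted_sum.items():
        combined[locus][allele] = total / total_n[(locus, allele)]
    return dict(combined)


combined = combine_per_locus(records)
for locus, freqs in combined.items():
    # Frequencies are only meaningful within one locus, where they sum to 1.
    print(locus, freqs, "sum =", round(sum(freqs.values()), 6))
```

If the two loci were pooled into one table, the frequencies would sum to 2 rather than 1, which is why the combining step handles one locus at a time.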
However, if you're interested in calculating linkage disequilibrium or haplotype frequencies, you have to be more careful. You cannot calculate the frequency of people with HLA-A01 and HLA-B01 from the frequencies of these alleles separately, because there is strong linkage disequilibrium between HLA loci. You have to use studies that have measured HLA-A and HLA-B in the same individuals.
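A small worked example may make this concrete. The haplotype frequencies below are invented for illustration (they are not real HLA data), and the allele labels are placeholders. The sketch compares the joint frequency measured in the same individuals with the product of the two marginal allele frequencies; the difference is the usual linkage disequilibrium coefficient D.

```python
# Hypothetical two-locus haplotype frequencies (made-up numbers).
# Keys are (HLA-A allele, HLA-B allele); values sum to 1.
haplotype_freq = {
    ("A01", "B01"): 0.10,
    ("A01", "B_other"): 0.05,
    ("A_other", "B01"): 0.05,
    ("A_other", "B_other"): 0.80,
}

# Marginal allele frequencies, as you would get from single-locus studies.
p_a01 = sum(f for (a, _), f in haplotype_freq.items() if a == "A01")
p_b01 = sum(f for (_, b), f in haplotype_freq.items() if b == "B01")

# What you would wrongly estimate if the loci were independent.
expected_if_independent = p_a01 * p_b01

# What was actually observed in the same individuals.
observed = haplotype_freq[("A01", "B01")]

# Linkage disequilibrium coefficient: D = p_AB - p_A * p_B.
d = observed - expected_if_independent

print(f"p(A01) = {p_a01:.4f}, p(B01) = {p_b01:.4f}")
print(f"product of marginals = {expected_if_independent:.4f}")
print(f"observed joint frequency = {observed:.4f}")
print(f"D = {d:.4f}")
```

With these made-up numbers each allele has a frequency of 0.15, so the product is 0.0225, while the joint frequency is 0.10. Multiplying the separate frequencies would underestimate the haplotype by more than a factor of four, and the marginals alone cannot tell you that.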
Since allelefrequencies.net does provide access to haplotype data, it would be possible to extend HLAfreq to estimate haplotype frequencies and linkage disequilibrium too. But that would depend on interest from users.
Does that answer your question?
Hello,
Thank you very much for a nice package!
My question is: can one use allelefrequencies data to calculate relative usage of different loci in a population or ethnic group? Does it actually make sense? In this notebook https://github.com/BarinthusBio/HLAfreq/blob/main/examples/single_country.ipynb it’s mentioned that frequency estimates can be combined only for a single locus at a time, but why?