Genotype frequencies in the base population are assumed to follow Hardy-Weinberg equilibrium (HWE) given the allele frequency (p, q=1-p): [p^2, 2 p q, q^2].
However, with inbreeding in the population we actually expect this distribution: [p^2 + p q F, 2 p q (1 - F), q^2 + p q F], where F is the population (average) inbreeding coefficient.
Does this matter? If we had highly inbred individuals, say with the distribution [0.5, 0.0, 0.5], HWE would not represent the genotype frequencies well: for this example p=0.5, so HWE gives [p^2, 2 p q, q^2]=[0.25, 0.50, 0.25], but we should have [0.5, 0.0, 0.5] (which corresponds to F=1). This seems like a large deviation, though the example is extreme.
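To make the comparison concrete, here is a minimal Python sketch of the two formulas above. The function name `genotype_frequencies` is hypothetical and not part of any existing code; it only evaluates [p^2 + p q F, 2 p q (1 - F), q^2 + p q F], with F=0 giving plain HWE.

```python
def genotype_frequencies(p, F=0.0):
    """Return expected genotype frequencies [AA, Aa, aa].

    p is the frequency of allele A and F is the average inbreeding
    coefficient. F=0 reduces to Hardy-Weinberg proportions.
    (Hypothetical helper, for illustration only.)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency p must be in [0, 1]")
    q = 1.0 - p
    return [
        p * p + p * q * F,        # homozygote AA
        2.0 * p * q * (1.0 - F),  # heterozygote Aa
        q * q + p * q * F,        # homozygote aa
    ]


print(genotype_frequencies(0.5))         # HWE: [0.25, 0.5, 0.25]
print(genotype_frequencies(0.5, F=1.0))  # fully inbred: [0.5, 0.0, 0.5]
```

For any p and F the three frequencies sum to 1, so F only moves mass from the heterozygote class to the two homozygote classes.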
We could expand the functionality and estimate F, but we often have very limited information about the average/population inbreeding in the base population (where by definition we assume F=0), so we will likely not be able to estimate it. Maybe it would be feasible in cases where we have genomic data, or if we can make Legarra's metafounders work ;)
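For the case where genotype data are available, one simple option is the single-locus moment estimator F = 1 - H_obs / (2 p q), which follows from the heterozygote term 2 p q (1 - F). The sketch below is only an illustration under that assumption: the function name `estimate_inbreeding` is hypothetical, it uses a single locus, and it is not the metafounder approach.

```python
def estimate_inbreeding(n_AA, n_Aa, n_aa):
    """Estimate F at one locus from observed genotype counts.

    Uses F = 1 - H_obs / (2 p q), where H_obs is the observed
    heterozygote frequency and p is estimated from the same counts.
    (Hypothetical helper, for illustration only.)
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected_het = 2.0 * p * q       # heterozygosity expected under HWE
    if expected_het == 0.0:
        raise ValueError("locus is fixed, F is undefined")
    observed_het = n_Aa / n
    return 1.0 - observed_het / expected_het


print(estimate_inbreeding(25, 50, 25))  # HWE proportions: 0.0
print(estimate_inbreeding(50, 0, 50))   # no heterozygotes: 1.0
```

A single locus gives a noisy estimate, so in practice one would presumably average over many loci; how that would fit into the existing functionality is left open here.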