jgx65 / hierfstat

the hierfstat package
Negative kinship values with beta.dosage? #73

Closed alexkrohn closed 1 month ago

alexkrohn commented 2 months ago

I'm interested in using beta.dosage to calculate a kinship matrix because it's likely some of the founding individuals in this cohort were related.

Can you explain the values of the kinship matrix? I assumed they would be kinship coefficients, i.e. IBD probabilities ranging from 0 to 0.5. However, when I run the calculation, some of the values are negative. What might cause these negative values off the diagonal?

Reprex from the package example:

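library(hierfstat)
set.seed(1234)
# 10 x 10 matrix of random dosages (0, 1 or 2 copies of an allele)
dos <- matrix(sample(0:2, size = 100, replace = TRUE), ncol = 10)
# Kinship matrix; with inb = TRUE the diagonal holds inbreeding coefficients
beta.dosage(dos, inb = TRUE)

jgx65 commented 1 month ago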

By definition, the mean of the off-diagonal elements of the beta estimator of kinship is zero, so many kinship estimates will be negative. As we explain here and here, kinship measures are relative to a reference. The reference for the beta estimator is the mean of all kinships in the population. A negative value means the two individuals being compared are less related than average, while a value larger than 0 indicates they are more related than average. Figure 6 of the genetics paper and the accompanying text explain this in some detail, including how you could change the reference to make all estimates positive if you prefer (see eq 11).
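
To make the point about the reference concrete, here is a minimal sketch in base R. It uses a small hand-made matrix with hypothetical values (not real `beta.dosage` output) whose off-diagonal mean is zero, and a hypothetical helper `rebase_kinship()` that applies the usual change of reference, (K - ref) / (1 - ref), with the smallest off-diagonal value as the new reference. This is an assumption about what changing the reference looks like in practice and may not match eq 11 of the paper term for term.

```r
# Toy symmetric "kinship" matrix (hypothetical values for illustration only).
# Its off-diagonal entries average to zero, as with the beta estimator.
K <- matrix(c( 0.10,  0.20, -0.05, -0.15,
               0.20,  0.05, -0.10, -0.05,
              -0.05, -0.10,  0.00,  0.15,
              -0.15, -0.05,  0.15, -0.02),
            nrow = 4, byrow = TRUE)

# Mean of the pairwise (off-diagonal) values: zero up to rounding error.
off_diag_mean <- mean(K[upper.tri(K)])

# Hypothetical helper: re-express every value relative to a new reference.
# By default the reference is the least related pair in the matrix.
rebase_kinship <- function(K, ref = min(K[upper.tri(K)])) {
  (K - ref) / (1 - ref)
}

K_pos <- rebase_kinship(K)

# After rebasing, the least related pair sits at 0 and no pair is negative.
min_after <- min(K_pos[upper.tri(K_pos)])
```

The same helper could be applied to the matrix returned by `beta.dosage()`. Rebasing only shifts and rescales the values, so the ranking of pairs by relatedness is unchanged.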

Hope this helps

jgx65 commented 1 month ago

@alexkrohn, any reason to keep this issue open?

alexkrohn commented 1 month ago

Hi @jgx65. Thanks for your reply and detailed explanation.