jgx65 / hierfstat

the hierfstat package
Inconsistency in genet.dist function #48

Closed akamolphat closed 3 years ago

akamolphat commented 3 years ago

There are some inconsistencies with the genet.dist function. The result from executing the function with a genind object is not the same as with a dataframe. Please see an example below:

library(adegenet)
library(hierfstat)
data("nancycats")
a <- genet.dist(nancycats, method = "WC84")
b <- genet.dist(as.data.frame(cbind(locality = nancycats$pop, nancycats$tab)), diploid = TRUE, method = "WC84")

Note that a and b are not the same. I think it has something to do with the genind2hierfstat function which was used in the genet.dist function to automatically convert genind object to a dataframe.

nancy <- genind2hierfstat(nancycats)
nancy2 <- as.data.frame(cbind(locality = nancycats$pop, nancycats$tab))

a2 <- genet.dist(nancy, method = "WC84")
b2 <- genet.dist(nancy2, method = "WC84")

I understand that the function's documentation does not mention genind objects as input, but there are tutorials out there that suggest a genind object can be used (e.g. https://tomjenkins.netlify.app/2020/09/21/r-popgen-getting-started/#5). I thought it would be good to point this out anyway, as the function executes with no warnings when a genind object is used.

jgx65 commented 3 years ago

To convert a genind object to hierfstat format, genind2hierfstat is the function to use. nancy2 is not in the correct hierfstat format. For the Weir & Cockerham pairwise Fsts, the results in a and a2 in your example are correct, not those in b and b2.
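A quick way to see the difference between the two layouts is to compare their shapes. The sketch below reuses the objects from the example above; it relies on the hierfstat format having one population column followed by one column per locus (each genotype coded as a single number), whereas nancycats$tab holds one column of allele counts per allele.

```r
library(adegenet)
library(hierfstat)
data("nancycats")

# hierfstat format: population column + one genotype column per locus
nancy <- genind2hierfstat(nancycats)

# Not hierfstat format: population column + one count column per allele
nancy2 <- as.data.frame(cbind(locality = nancycats$pop, nancycats$tab))

# Same number of individuals, very different number of columns
dim(nancy)
dim(nancy2)

# Peek at the first few cells of each layout
nancy[1:3, 1:3]
nancy2[1:3, 1:3]
```

Because genet.dist reads every column after the first as a locus, it ends up treating each allele-count column in nancy2 as if it were a separate locus, which is why b and b2 differ from a and a2.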

jgx65 commented 3 years ago

Note that you may use the fs.dosage function on nancy2 and take the Fst2x2 component of its result. This is not exactly the same as pairwise WC84 (unless sample sizes are identical), but we (Weir and Goudet 2017) recommend using this new estimator when the assumption of independence of populations cannot be made (which is most often the case).
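A minimal sketch of that suggestion might look like the following. It assumes that fs.dosage takes the allele-count columns as its dos argument and the population labels as a separate pop argument, so the locality column of nancy2 is split off first. Treat the call as an assumption to check against ?fs.dosage rather than a confirmed recipe.

```r
library(adegenet)
library(hierfstat)
data("nancycats")

nancy2 <- as.data.frame(cbind(locality = nancycats$pop, nancycats$tab))

# Assumption: allele counts go in `dos`, population labels in `pop`
dos <- as.matrix(nancy2[, -1])
pop <- nancy2[, 1]

res <- fs.dosage(dos, pop = pop)

# Pairwise Fst component mentioned in the comment above
fst2x2 <- res$Fst2x2
fst2x2
```

As noted above, these values are not expected to match the WC84 results in a and a2 unless sample sizes are identical.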

akamolphat commented 3 years ago

Hi Jerome,

Thank you :)

Yours sincerely, Kamolphat Atsawawaranunt
