jgx65 / hierfstat

the hierfstat package
24 stars 14 forks source link

explanation of behavior #44

Closed pdimens closed 3 years ago

pdimens commented 3 years ago

Good evening,

I'm trying to understand how hierfstat calculates Weir-Cockerham pairwise FST and hope you may provide some clarification at what's going on at https://github.com/jgx65/hierfstat/blob/f35069782833e4febd0466c5195f3f8107d8d8c8/R/nwc.R#L66-L69 My understanding is that hetpl creates a list of whether an allele appears in a heterozygous state for an individual at a locus Locus1

Locus2

I'm not clear on what what mho <- is doing. For example, when using the nancycats data provided by adegenet, allele 145 (idx = 14) in locus fca8 (idx = 1) appears in the heterozygous state 9 times:

tmp <- hetpl[[1]][[14]]

which(tmp == TRUE) #9 of them
38 44 53 70 72 87 140 159 187  

I was under the impression that mho https://github.com/jgx65/hierfstat/blob/f35069782833e4febd0466c5195f3f8107d8d8c8/R/nwc.R#L66-L69 (before it gets concatenated) is a list of matrices, where within each matrix the rows correspond to unique alleles and the columns correspond to the unique populations. However, looking at the first matrix (presumably the first locus fca8), the 14th row, which should(?) correspond to allele 14 (145), does not add up to 9:

mho[[1]][14,]
1 0 0 5 0 0 0 3 2 4 1 3 0 0 0 0 0

May you please explain what counts/values are occuring in mho?

jgx65 commented 3 years ago

You are correct, mho here does not correspond to what it is supposed to (you would need to add a byrow=TRUE argument to the matrix command).

But the next line of code restores everything back to normal (e.g. you can check that mho[14,] adds up to 9), by reading correctly the element of this list into a matrix. It is more complicated to recover the correct order of elements otherwise.

Regards

pdimens commented 3 years ago

Thank you for the clarification. If the second mho creates this correct matrix, what do the values in the first mho correspond to then?

jgx65 commented 3 years ago

I leave it to you to work it out (look at the results of

mho<-lapply(hetpl, 
 function(x) matrix(unlist(lapply(x, 
 function(y) tapply(y,pop,sum,na.rm=TRUE))),ncol=np), byrow=TRUE )