emmanuelparadis / pegas

Population and Evolutionary Genetics Analysis System
GNU General Public License v2.0
27 stars 10 forks source link

Feature request: Calculate phi statistics for AMOVA #25

Closed zkamvar closed 6 years ago

zkamvar commented 6 years ago

I must admit that there are a lot of people who use the ade4 implementation of amova and it's partially my fault. One of the things that appealing about it is the fact that it calculates phi values automatically. I realized that the calculation for this is nicely automated in ade4 and I was wondering if it was possible to add these values to the output of amova:

https://github.com/sdray/ade4/blob/2ee45cb64071c0023688effcadf1ff71f259141c/R/amova.R#L149-L163

    Statphi <- function(sigma) {
        # Phi-statistics.
        f <- rep(0, length(sigma) - 1)
        if (length(sigma) == 3) {
            f <- rep(0, 1)
        }
        f[1] <- (sigma[length(sigma)] - sigma[length(sigma) - 1]) / sigma[length(sigma)]
        if (length(f) > 1) {
            s1 <- cumsum(sigma[(length(sigma) - 1):2])[-1]
            s2 <- sigma[(length(sigma) - 2):2]
            f[length(f)] <- sigma[1] / sigma[length(sigma)]        
            f[2:(length(f) - 1)] <- s2 / s1
        }
        return(f)
    }        
emmanuelparadis commented 6 years ago

I'll have a look at it soon. This will need to be adapted because pegas::amova accepts any number of levels.

zkamvar commented 6 years ago

From what I can tell from the script, it already does calculate it for multiple levels.

On Nov 30, 2017, at 17:41 , Emmanuel Paradis notifications@github.com wrote:

I'll have a look at it soon. This will need to be adapted because pegas::amova accepts any number of levels.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/emmanuelparadis/pegas/issues/25#issuecomment-348356965, or mute the thread https://github.com/notifications/unsubscribe-auth/ADeIliFFbirctYP4apRbB5pQCA1Cqs9_ks5s7z0vgaJpZM4QvohO.

emmanuelparadis commented 6 years ago

Done for the case with 2 levels using Excoffier et al.'s original formula. Just pushed version 0.10-3.

zkamvar commented 6 years ago

Correct me if I'm wrong, but the phi statistics are calculated in a similar way as F statistics, no? That means that it's possible to use the same strategy from @jgx65's code in varcomp.glob() to calculate these stats for all levels of the hierarchy.


ade4phi <- function(sig){
  n <- length(sig)
  f <- rep(0, n - 1)
  if (length(sig) == 3) {
    f <- rep(0, 1)
  }
  f[1] <- (sig[n] - sig[n - 1]) / sig[n]
  if (length(f) > 1) {
    s1 <- cumsum(sig[(n - 1):2])[-1]
    s2 <- sig[(n - 2):2]
    f[length(f)] <- sig[1] / sig[n]
    f[2:(length(f) - 1)] <- s2 / s1
  }
  return(f)
}

hfsphi <- function(sig){
  tot      <- sig
  nblevels <- length(tot)
  f <- matrix(rep(0, (nblevels - 1)^2), 
              ncol = (nblevels - 1))
  for (i in 1:(nblevels - 1)) {
    for (j in i:(nblevels - 1)) {
      f[i, j] <- sum(tot[i:j])/sum(tot[i:nblevels])
    }
  }
  f
}
# adapted from the ade4 amova code
Statphi.default <- function(sig, method = "ade4") {
  method <- match.arg(method, c("ade4", "hierfstat"))
  if (method == "ade4") ade4phi(sig) else hfsphi(sig[-length(sig)])
}

is_pegas_amova <- function(sig) identical(names(sig), c("tab", "varcoef", "varcomp", "call"))

Statphi.amova <- function(sig, ...){
  if (is_pegas_amova(sig)){
    if (is.data.frame(sig$varcomp)){
      siggies <- sig$varcomp$sigma2
    } else {
      siggies <- sig$varcomp
    }
    Statphi(c(siggies, sum(siggies)), ...)
  } else {
    Statphi(sig$componentsofcovariance$Sigma, ...)
  }
}

Statphi <- function(sig, ...){
  UseMethod("Statphi")
}

library("poppr")
library("pegas")
library("hierfstat")
data(nancycats)
strata(nancycats) <- data.frame(pop = pop(nancycats))
a <- poppr.amova(nancycats, ~pop, method = "pegas", nperm = 0)
Statphi(a)
#> [1] 0.20563714 0.13469696 0.08198305
Statphi(a, "h")
#>            [,1]      [,2]
#> [1,] 0.08198305 0.2056371
#> [2,] 0.00000000 0.1346970

data(microbov)
strata(microbov) <- data.frame(other(microbov))
b <- poppr.amova(microbov, ~coun/spe, method = "pegas", nperm = 0)
Statphi(b)
#> [1] 0.19187860 0.10167494 0.07099783 0.03166337
Statphi(b, "h")
#>            [,1]       [,2]      [,3]
#> [1,] 0.03166337 0.10041316 0.1918786
#> [2,] 0.00000000 0.07099783 0.1654541
#> [3,] 0.00000000 0.00000000 0.1016749

# PhiST is always going to be in the top right corner of the matrix
# all other Phi values will be the diagonal. 
bc <- poppr.amova(microbov, ~coun/spe/breed, method = "pegas", nperm = 0)
pha <- Statphi(bc)
phh <- Statphi(bc, "h")
phh <- c(phh[1, ncol(phh)], rev(diag(phh)))
phh
#> [1] 0.16650510 0.05090004 0.06588374 0.02812636 0.03265754
pha
#> [1] 0.16650510 0.05090004 0.06588374 0.02812636 0.03265754

Created on 2017-12-14 by the reprex package (v0.1.1.9000).

emmanuelparadis commented 6 years ago

Great! I've modified the code so it works for all levels. The names of the returned vector gives the levels of the correlations (and the equivalence with Excoffier et al's notation in the case of two levels). Just pushed on GH. Do not hesitate to come back to me if there's something wrong. Cheers.