adeverse / adespatial

Multivariate Multiscale Spatial Analysis
http://adeverse.github.io/adespatial/
32 stars 7 forks source link

Betadiversity decomposition apparently yields different results for Baselga method as compared to the betapart package #21

Closed ManuelPopp closed 9 months ago

ManuelPopp commented 2 years ago

I just found a question on Researchgate, why adespatial's beta.div.comp(..., coef="BJ", quant=FALSE) and betapart's beta.multi(..., index.family="jaccard") return different output, despite one would assume the functions run similar calculations, given the index family is set to "jaccard" and the method of Baselga et al is used.

So I tested the following:

require("ade4")
data(doubs)
A = doubs$fish[-8,]
A <- ifelse(A > 0, 1, 0)

require("adespatial")
beta.div.comp(A, coef="BJ", quant=FALSE)$part

Resulting in:

     BDtotal         Repl          Nes Repl/BDtotal  Nes/BDtotal 
   0.3258676    0.1674413    0.1584263    0.5138323    0.4861677

Whereas when I now ran

require("betapart")
beta.multi(A, index.family="jaccard")

it returns

$beta.JTU
[1] 0.7885784

$beta.JNE
[1] 0.1470249

$beta.JAC
[1] 0.9356033

Obviously, the values are not in the same order, but also they are completely different.

So I went to github and copied the code from the relevant functions of both packages. I changed variable names where variables had a pendant in the other package, to indicate which values or parts of the script are basically the same for both packages. This is the result:

require("adespatial")
require("betapart")
require("ade4")
# load some example data
data(doubs)
A = doubs$fish[-8,]
A <- ifelse(A > 0, 1, 0)

#-----------------------------------------------------------------------------
#### adespatial
### beta.div.comp
# first, run the function
beta.div.comp(A, coef="BJ", quant=FALSE)

## what it does:
n <- nrow(A)
a <- A %*% t(A)
b <- A %*% (1 - t(A))
c <- (1 - A) %*% t(A)
min.bc <- pmin(b, c)
D <- (b + c) / (a + b + c)               # Jaccard dissimilarity
repl <- 2 * min.bc / (a + 2 * min.bc)    # replacement, turnover
rich <- D - repl

D <- as.dist(D)
repl <- as.dist(repl)
rich <- as.dist(rich)

## output values:
# turnover/replacement
total.div <- sum(D) / (n * (n - 1))     # == mean(D) / 2; btw, why is it not "just" the mean?
# nestedness
repl.div <- sum(repl) / (n * (n - 1))   # == mean(repl) / 2
# total
rich.div <- sum(rich) / (n * (n - 1))   # == mean(rich) / 2

## the following produces the same values using betapart:
mean(beta.pair(A, index.family = "jaccard")$beta.jac) / 2
mean(beta.pair(A, index.family = "jaccard")$beta.jtu) / 2
mean(beta.pair(A, index.family = "jaccard")$beta.jne) / 2

#-----------------------------------------------------------------------------
#### betapart
### beta.multi
# first, run the function
beta.multi(A, index.family="jaccard")

## what it does:
a <- A %*% t(A)
c <-  abs(sweep(a, 2, diag(a)))
sumSi <- sum(diag(a))                   # species by site richness
St <- sum(colSums(A) > 0)               # regional species richness; or ncol(A), if all columns contain values > 0
ms.a <- sumSi - St                      # multi site shared species term

max.bc <- pmax(c, t(c))
min.bc <- pmin(c, t(c))

sum.max.bc <- sum(max.bc[lower.tri(max.bc)]) # == sum(as.dist(max.bc))
sum.min.bc <- sum(min.bc[lower.tri(min.bc)]) # == sum(as.dist(min.bc))

## output values:
# turnover/replacement
beta.jtu <- (2 * sum.min.bc) / (ms.a + (2 * sum.min.bc))
# nestedness
beta.jne <- (ms.a / (ms.a + (2 * sum.min.bc))) * ((sum.max.bc - sum.min.bc) / ((ms.a) + sum.max.bc + sum.min.bc))
# total
beta.jac <- (sum.min.bc + sum.max.bc) / (ms.a + sum.min.bc + sum.max.bc)

As you can see, there are some basic equations (the ones described in the relevant papers on the partitioning of beta diversity), which are similar for both approaches. However, the adespatial function first calculates some diversity matrices and then sums them up while the betapart approach first summarises the input matrix to obtain single values and then applies the equations for betadiversity decomposition.

Now, my question would be: Why are there different outputs? Are there errors in the code, or are the functions supposed to behave differently? In case the differences are intended, could you please elaborate why or add to the documentation?

sdray commented 2 years ago

Hi Manuel. I forwarded your message to Pierre Legendre who implemented the function but do not have a github account. I come back when I receive his answer.

Cheers

legendre commented 2 years ago

Baselga Beta decomposition.txt

ManuelPopp commented 2 years ago

Hi together, thank you a lot for your fast replies. I understood that there are different things calculated by beta.div.comp and beta.multi and that the calculation by beta.div.comp is based on the pairwise dissimilarity measures, as I showed in my example. The main question was why that is and, since there are different results: which one to use. As P. Legendre stated "Do NOT use function beta.pair.R available in betapart". I take this note is with regard to the purpose of calculating multi-site dissimilarities. Baselga (2012; doi: 10.1111/j.1466-8238.2011.00756.x) stressed that multiple-site dissimilarity measures have to be used when more then two sites are compared, i.e., we have to use beta.multi(), which produces different results than sum(pairwise_diversity_matrix) / (n * (n - 1)). I suppose this means that beta.div.comp()$part should not be used for multiple-site beta diversity measures, since it is also based on the pairwise diversity matrices. Did I now understand correctly?

legendre commented 2 years ago

P. Legendre answer to Manuel Popp, 06Apr2022.txt

ManuelPopp commented 2 years ago

@sdray With regard to the code used in adespatial (and in betapart), I now know that the differences arise from different equations implemented (for partly different purpose). This means you can basically close this issue here, since all coding-related questions are solved.

@legendre Baselga's betapart::beta.multi() description states "Computes 3 multiple-site dissimilarities accounting for the spatial turnover and the nestedness components of beta diversity". And "beta.SOR = value of the overall beta diversity, measured as Sorensen dissimilarity". You say "I don't read measures of variance, or beta diversity, in [the beta.JAC and beta.SOR] equations." To a student who has not followed the debate about beta diversity (partitioning) and related indices from the beginning, this is a bit confusing. I assume, Baselga has a wider view of the term. Anyways, I will probably have to carefully read through the debate (again) from the start. Thank you for your contributions here; it helped me gain a better understanding.

Best regards, Manuel

mar-iana commented 9 months ago

Hello, sorry for the trivial question, but do you know the reason why adespatial form the CRAN repository was removed? Thanks in advance.

sdray commented 9 months ago

Hi, I have been ripleyed. My university installed an email filtering system to reduce spams. B. Ripley sent me an email and received a message from the filterig system. Then, he decides to remove my packages as the system violates the CRAN policy. I will resubmit the packages soon with a new email address. As an alternative, you can install the version from github. Sorry for the inconvenience.

mar-iana commented 9 months ago

Thank you very much for the quick answer. I hope it is resolved for the good of the community đŸ˜„. I have already downloaded it from github, thanks! Regards

aursiber commented 8 months ago

@mar-iana FYI, adespatial is back on CRAN since October 12th :-)