vegandevs / vegan

R package for community ecologists: popular ordination methods, ecological null models & diversity analysis
https://vegandevs.github.io/vegan/
GNU General Public License v2.0
454 stars 98 forks source link

Custom dissimilarity matrix for metaMDS #332

Closed KasperSkytte closed 5 years ago

KasperSkytte commented 5 years ago

Is it possible to provide a distance/dissimilarity matrix manually to metaMDS(), ie to use UniFrac distances or others, without the inherent distance argument in the function calculates distances again on top? Would it make sense to just set distance = "euclidean"?

gavinsimpson commented 5 years ago

Yes, it is possible: ?metaMDS() describes the comm argument as:

Community data. Alternatively, dissimilarities either as a dist structure or as a symmetric square matrix. In the latter case all other stages are skipped except random starts and centring and pc rotation of axes.

So you should be able to pass your UniFrac distances as wither a "dist" class object or as a square symmetric matrix. If you can get your distances in one of those formats, just pass them as the first argument, and then, as indicated, most other steps will be skipped, which will involve ignoring the distance argument.

KasperSkytte commented 5 years ago

Hi again

I tried that and wondered why I didn't get the same or similar results when providing a dist object instead. Sorry, I should have provided some example code. A simple example like this produces very different results.

library(vegan)
#> Loading required package: permute
#> Loading required package: lattice
#> This is vegan 2.5-6
data("varespec")
distmatrix <- vegdist(varespec, method = "bray")
model1 <- metaMDS(distmatrix, trace = FALSE)
model1
#> 
#> Call:
#> metaMDS(comm = distmatrix, trace = FALSE) 
#> 
#> global Multidimensional Scaling using monoMDS
#> 
#> Data:     distmatrix 
#> Distance: bray 
#> 
#> Dimensions: 2 
#> Stress:     0.1000211 
#> Stress type 1, weak ties
#> Two convergent solutions found after 20 tries
#> Scaling: centring, PC rotation 
#> Species: scores missing
model2 <- metaMDS(varespec, distance = "bray", trace = FALSE)
model2
#> 
#> Call:
#> metaMDS(comm = varespec, distance = "bray", trace = FALSE) 
#> 
#> global Multidimensional Scaling using monoMDS
#> 
#> Data:     wisconsin(sqrt(varespec)) 
#> Distance: bray 
#> 
#> Dimensions: 2 
#> Stress:     0.1825658 
#> Stress type 1, weak ties
#> Two convergent solutions found after 20 tries
#> Scaling: centring, PC rotation, halfchange scaling 
#> Species: expanded scores based on 'wisconsin(sqrt(varespec))'
procrustes(model1, model2)
#> 
#> Call:
#> procrustes(X = model1, Y = model2) 
#> 
#> Procrustes sum of squares:
#> 0.7214

Created on 2019-10-17 by the reprex package (v0.3.0)

But I now see the problem. Using metaMDS(varespec, distance = "bray") also performs standardization and transformation. So setting autotransform = FALSE resolves the issue:

library(vegan)
#> Loading required package: permute
#> Loading required package: lattice
#> This is vegan 2.5-6
data("varespec")
distmatrix <- vegdist(varespec, method = "bray")
model1 <- metaMDS(distmatrix, trace = FALSE, autotransform = FALSE)
model1
#> 
#> Call:
#> metaMDS(comm = distmatrix, autotransform = FALSE, trace = FALSE) 
#> 
#> global Multidimensional Scaling using monoMDS
#> 
#> Data:     distmatrix 
#> Distance: bray 
#> 
#> Dimensions: 2 
#> Stress:     0.1000211 
#> Stress type 1, weak ties
#> Two convergent solutions found after 20 tries
#> Scaling: centring, PC rotation 
#> Species: scores missing
model2 <- metaMDS(varespec, distance = "bray", trace = FALSE, autotransform = FALSE)
model2
#> 
#> Call:
#> metaMDS(comm = varespec, distance = "bray", autotransform = FALSE,      trace = FALSE) 
#> 
#> global Multidimensional Scaling using monoMDS
#> 
#> Data:     varespec 
#> Distance: bray 
#> 
#> Dimensions: 2 
#> Stress:     0.1000211 
#> Stress type 1, weak ties
#> Two convergent solutions found after 20 tries
#> Scaling: centring, PC rotation, halfchange scaling 
#> Species: expanded scores based on 'varespec'
procrustes(model1, model2)
#> 
#> Call:
#> procrustes(X = model1, Y = model2) 
#> 
#> Procrustes sum of squares:
#> 2.013e-10

Created on 2019-10-17 by the reprex package (v0.3.0)

gavinsimpson commented 5 years ago

Also note that metaMDS() is doing random starts, so I wouldn't even expect the same call run twice in succession to return the same value.

gavinsimpson commented 5 years ago

Yeah; if you're comparing these you will need to be sure to turn off all options that metaMDS() automatically turns off when you pass it a dissimilarity matrix.

Glad you got to the bottom of the apparent difference.