vegandevs / vegan

R package for community ecologists: popular ordination methods, ecological null models & diversity analysis
https://vegandevs.github.io/vegan/
GNU General Public License v2.0

protest() procrustes() analysis #576

Closed melissaszy closed 1 year ago

melissaszy commented 1 year ago

Hello,

I am trying to calculate the similarity between two matrices using the procrustes() and protest() functions. I am comparing host phylogeny to host-associated microbes.

host_dist: a dist object. Phylogenetic distances derived in MEGA11 and imported into R using read.csv. I removed the outgroup and then used as.dist() to convert the data.frame to a dist object.

microbe_dist_wuf: a dist object. Weighted UniFrac distances calculated with UniFrac() from a phyloseq object.

After deriving both dist objects, I passed them to procrustes() and protest() as follows:

    procrustes(X = host_dist, Y = microbe_dist_wuf, symmetric = TRUE)
    protest(X = host_dist, Y = microbe_dist_wuf, scores = "sites", permutations = 999)

My questions are:

  1. Do both dist objects need to be in the same order? Both dist objects here have the same row and column names. Do the row and column names need to be in the same order in host_dist and microbe_dist_wuf?
  2. Is it okay to have an outgroup when generating phylogenetic distances of the host but no outgroup when generating distances between microbiota samples?

Thank you for your time.

jarioksa commented 1 year ago

Most importantly: procrustes and protest compare two configurations. They do not handle dissimilarities, only configurations. Functions stats::cmdscale and vegan::wcmdscale can be used to map dissimilarities into configurations. The functions appear to run smoothly when given dissimilarities, but the results are meaningless.

About your specific questions:

  1. You cannot use dist objects (see above), but rows (observations) must be in the same order and match in the two configurations. Row names are neither used nor inspected. Columns of a configuration are rotated, so their names and order are irrelevant.
  2. This question is specific to your research field. The functions are agnostic to these choices and will work in either case. Consult people working in your field.
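
A minimal sketch of the two points above, using only base R. The data are simulated stand-ins for the two dist objects in the question; the sample labels, the noise level, and the choice of k = 2 axes are assumptions for illustration. It maps each dissimilarity matrix to a configuration with stats::cmdscale and then reorders one configuration by row name, because rows are matched by position rather than by name.

```r
# Simulated stand-ins for host_dist and microbe_dist_wuf (hypothetical data).
set.seed(1)
labels <- paste0("host", 1:8)
host_xy <- matrix(rnorm(16), ncol = 2, dimnames = list(labels, NULL))
host_dist <- dist(host_xy)

# Second distance matrix: same samples, but stored in a shuffled order.
shuffled <- sample(labels)
microbe_dist <- dist(host_xy[shuffled, ] + rnorm(16, sd = 0.1))

# Map each dissimilarity matrix to a configuration (principal coordinates).
host_conf <- cmdscale(host_dist, k = 2)
microbe_conf <- cmdscale(microbe_dist, k = 2)

# Row names are not inspected downstream, so align the rows explicitly.
microbe_conf <- microbe_conf[rownames(host_conf), , drop = FALSE]
stopifnot(identical(rownames(host_conf), rownames(microbe_conf)))
```

After this step both matrices have one row per sample in the same order, which is the form that configuration-based functions expect.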

melissaszy commented 1 year ago

Thank you so much for your reply, @jarioksa.

I made the following edits to my script:

  1. I used stats::cmdscale as suggested to change my dist objects to matrix objects (each matrix object contains two columns which I now realise are the coordinates of the configuration)
  2. I ensured that the row.names of all my matrix objects are the same and in the same order
  3. I then ran procrustes() and protest() using the matrix objects
  4. I also ran a Mantel test using ecodist::mantel(), with the dist objects as input
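
The steps above could look roughly like the sketch below. It is not the script from this thread: the data are simulated, k = 2 is an assumption, and vegan::mantel is swapped in for ecodist::mantel so that only one package is needed.

```r
library(vegan)

# Simulated stand-ins for the two dist objects (hypothetical data).
set.seed(42)
labels <- paste0("host", 1:10)
coords <- matrix(rnorm(20), ncol = 2, dimnames = list(labels, NULL))
host_dist <- dist(coords)
microbe_dist_wuf <- dist(coords + rnorm(20, sd = 0.2))

# Step 1: dissimilarities -> configurations (two coordinate columns each).
host_conf <- cmdscale(host_dist, k = 2)
microbe_conf <- cmdscale(microbe_dist_wuf, k = 2)

# Step 2: make sure rows are in the same order in both configurations.
microbe_conf <- microbe_conf[rownames(host_conf), , drop = FALSE]

# Step 3: Procrustes rotation and its permutation test on the configurations.
fit <- procrustes(X = host_conf, Y = microbe_conf, symmetric = TRUE)
pro_test <- protest(X = host_conf, Y = microbe_conf, permutations = 999)
fit$ss           # Procrustes sum of squares
pro_test$t0      # Procrustes correlation
pro_test$signif  # permutation p-value

# Step 4: Mantel test, which does take the dist objects directly.
mantel_res <- mantel(host_dist, microbe_dist_wuf, permutations = 999)
mantel_res$statistic
mantel_res$signif
```

Note the asymmetry in inputs: procrustes() and protest() receive the configurations, while the Mantel test receives the original dist objects.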

Thank you once again for your help. If I misunderstood anything, please kindly inform me.

Cheers!