darunabas / phyloregion

Biogeographic regionalization and spatial conservation
http://phyloregion.com/
18 stars 12 forks source link

phylobeta error #9

Open natirubirb opened 2 years ago

natirubirb commented 2 years ago

Hi! I am having some trouble running the function phylobeta. I get the following error: "Error in phylobeta_core(x, phy) : Labels of community matrix and tree differ!" I don't think that is the issue because the labels are the same. I double checked and I created a new tree using this code: treebeta<-keep.tip(timedtreeC,intersect(timedtreeC$tip.label,colnames(hope))) hope: is the name of my community matrix. I would appreciate any help with this issue. Thanks! Natali

patarol commented 2 years ago

Hi, @natirubirb.

Can you please provide a reproducible example? As you said it could be a simple name issue with the names. You can check it by doing setdiff(tip.labels(hope), names(timedtreeC)).

natirubirb commented 2 years ago

I think you meant to tell me to check for the difference on the betatree with the community column labels. I used: setdiff(tiplabels(treebeta),colnames(hope)) # I got this error Error in get("last_plot.phylo", envir = .PlotPhyloEnv) : object 'last_plot.phylo' not found

When I compared my previous tree (timedtreeC) to the betatree. It does exclude the species that were not on my community. So that works I have 337 species in my tree and 337 columns in my community with the same labels.

I am attaching the files needed. please change extention of betatree to nwk. betacommunity.csv betatree.txt

library(phyloregion) library(ape) library(tidyverse)

betacom<-read.csv("betacommunity.csv") treebeta<-ape::read.tree(file = "betatree.nwk")

hope<-dense2sparse(betacom)

phylobeta(hope, treebeta)

patarol commented 2 years ago

The problem is that your names don't match. When I compared tip labels with column names from your community data I get:

setdiff(colnames(betacom), treebeta$tip.label)

# [1] "plotID"                      "Andropogon_gerardii"        
# [3] "Chasmanthium_sessiliflorum"  "Cyperus_filiformis"         
# [5] "Cyperus_plukenetii"          "Cyperus_retrorsus"          
# [7] "Dichanthelium_angustifolium" "Dichanthelium_ovale"        
# [9] "Dichanthelium_villosissimum" "Diodia_teres"               
#[11] "Ionactis_linariifolius"      "Nyssa_sylvatica"            
#[13] "Oxalis_corniculata"          "Photinia_pyrifolia"         
#[15] "Pleopeltis_polypodioides"    "Polygala_grandiflora"       
#[17] "Polystichum_acrostichoides"  "Pteridium_aquilinum"        
#[19] "Saccharum_alopecuroides"     "Schizachyrium_scoparium"    
#[21] "Serenoa_repens"              "Vitis_rotundifolia"         
#[23] "Xyris_caroliniana"    

You can select the columns from your community data that match your treebeta object and then run phylobeta and get the result:

new_betacom <- betacom %>% 
  dplyr::select(-setdiff(colnames(betacom), treebeta$tip.label))

hope <- dense2sparse(new_betacom)
result_phybeta <- phylobeta(hope, treebeta)

str(result_phybeta)

#List of 3
# $ phylo.beta.sim: 'dist' num [1:20100] 0.158 0.199 0.255 0.258 0.331 ...
#  ..- attr(*, "Size")= int 201
#  ..- attr(*, "Diag")= logi FALSE
#  ..- attr(*, "Upper")= logi FALSE
# $ phylo.beta.sne: 'dist' num [1:20100] 0.1078 0.0656 0.1497 0.148 0.1532 ...
#  ..- attr(*, "Size")= int 201
#  ..- attr(*, "Diag")= logi FALSE
#  ..- attr(*, "Upper")= logi FALSE
# $ phylo.beta.sor: 'dist' num [1:20100] 0.266 0.265 0.404 0.406 0.485 ...
#  ..- attr(*, "Size")= int 201
#  ..- attr(*, "Diag")= logi FALSE
#  ..- attr(*, "Upper")= logi FALSE
KlausVigo commented 2 years ago

Hi @natirubirb,

dense2sparse assumes a matrix with the plotID as row names and not a data.frame with plotID in the first column. I will try to improve this and add some example to a vignette.

I discovered a little problem with your data. There are several duplicated labels for your plotID, e.g. "NC", "UN", "OF4", ... You might have sampled the same plot at several times. Do you want to combine these rows or give them unique labels? The interpretation is likely to differ quite a bit.

unique_plots <- make.unique(betacom$plotID)
combined_plots <- unique(betacom$plotID)

row.names(betacom) <- unique_plots
betacom_matrix <- as.matrix(betacom[,-1])
dim(betacom_matrix)

If you want to combine the occurrences in one plot execute the following lines, if not skip them:

x <- matrix(0, length(combined_plots), nrow(betacom), 
            dimnames = list(combined_plots, unique_plots))
x[cbind(match(betacom$plotI, combined_plots), seq_along(betacom$plotID))] <- 1
betacom_matrix <- x %*% betacom_matrix
betacom_matrix[betacom_matrix > 1] <- 1 # make presence absence matrix again
dim(betacom_matrix)

Now we are almost there:

hope<-dense2sparse(betacom_matrix)
pb <- phylobeta(hope, treebeta)

Now we get a warning, which we understand. So now take just the labels available both the tree and community matrix.

common_labels <- intersect(treebeta$tip.label, colnames(hope))
tree_cl <- keep.tip(treebeta, common_labels)
hope_cl <- hope[, common_labels]
pb <- phylobeta(hope_cl, tree_cl)

Cheers, Klaus

natirubirb commented 2 years ago

Hi Klaus,

Thank you for your input and suggestions. You are right, I have some replicates. My data frame originally had 3 columns: siteID, PlotID and subplotID. I was getting an error because of that and temporarily trimmed the data to only use plot ID without combining the data. Now I know that would create an error too. I was wondering about how you could compare the same community (plotID) from different sites (siteID) . I was hoping to compare the differences in phylogenetic diversity of similar communities in different sites. For example, for one part of my work, I have some native community plots (3 replicates) and I would like to compare them with 3 site that has been restored in the same way (fire at the same interval). Seems like I would have to choose only one plot per site if I understand the function correctly.

Another question, I don't think I understand how the comparison are done with the rows. does the function compares each row against each other? is there any way to group rows?

Thank you Klaus and paratol for your help! Natali