ropensci / allodb

An R package for biomass estimation at extratropical forest plots.
https://docs.ropensci.org/allodb/
GNU General Public License v3.0

Taxonomic distance #97

Closed gonzalezeb closed 4 years ago

gonzalezeb commented 5 years ago

I am writing this as an issue to make sure I understand what we are trying to do. For the taxonomic distance, we are trying to evaluate how closely related the species at a site are to the species listed in our equation table, which will help us weight (rank) the equations.

We may need a lookup table (a plant phylogeny tree?) as a source. I am not sure.

Please correct me if I am wrong @cpiponiot @teixeirak

teixeirak commented 5 years ago

Correct.

cpiponiot commented 5 years ago

To calculate phylogenetic distances, here are some interesting R packages:

gonzalezeb commented 5 years ago

I think I got it @cpiponiot

install.packages("ape")

install.packages("brranching")

install.packages("phytools")

install.packages("maps")

install.packages("adephylo")

install.packages("V.PhyloMaker")  # this didn't work, so install the GitHub version

install.packages("remotes")

remotes::install_github("jinyizju/V.PhyloMaker")

library(maps)
library(phytools)
library(adephylo)
library(ape)
library(brranching)

Example

We need to construct a tree to find the evolutionary distance between species

Use the function phylomatic from the brranching package

p <- c("Quercus alba", "Castanea dentata", "Liriodendron tulipifera", "Abies alba", "Acer rubrum")
tree <- phylomatic(taxa = p, get = 'POST')
plot(tree, no.margin = TRUE)

[image: plot of the tree]

But this tree has no edge.length (branch lengths), so we cannot calculate distances.

edge.length = a numeric vector giving the lengths of the branches given by edge

str(tree)
List of 5
 $ edge       : num [1:8, 1:2] 6 7 8 9 9 8 7 6 7 8 ...
 $ Nnode      : int 4
 $ tip.label  : chr [1:5] "quercus_alba" "castanea_dentata" "acer_rubrum" "liriodendron_tulipifera" ...
 $ edge.length: NULL
 $ node.label : chr [1:4] "seedplants" "magnoliales_to_asterales" "" "fagaceae"
 - attr(*, "class")= chr [1:2] "phylo" "phylomatic"
 - attr(*, "order")= chr "cladewise"

So we need to use the function modified.Grafen(tree, power=2) from the phytools package to compute edge lengths.

tree <- modified.Grafen(tree, power = 2)
node.paths(tree, node)  # 'node' is not defined above; supply a node number before running this

str(tree)
List of 5
 $ edge       : num [1:8, 1:2] 6 7 8 9 9 8 7 6 7 8 ...
 $ Nnode      : int 4
 $ tip.label  : chr [1:5] "quercus_alba" "castanea_dentata" "acer_rubrum" "liriodendron_tulipifera" ...
 $ edge.length: num [1:8] 0.36 0.28 0.2 0.12 0.12 0.32 0.6 0.96
 $ node.label : chr [1:4] "seedplants" "magnoliales_to_asterales" "" "fagaceae"
 - attr(*, "class")= chr [1:2] "phylo" "phylomatic"
 - attr(*, "order")= chr "cladewise"
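For anyone who wants to try this step without calling the Phylomatic web service, here is a minimal offline sketch. It is an illustration under assumptions, not the code used above: the Newick string is a hand-written toy topology, and it uses ape's compute.brlen (standard Grafen lengths) rather than phytools' modified.Grafen, so the numbers will differ from the str() output above.

```r
library(ape)

# Toy topology written by hand for illustration (not a Phylomatic result)
toy <- read.tree(text = "((((quercus_alba,castanea_dentata),acer_rubrum),liriodendron_tulipifera),abies_alba);")

# A tree read without branch lengths has no edge.length element
is.null(toy$edge.length)

# Standard Grafen branch lengths from ape (a swapped-in alternative to modified.Grafen)
toy_bl <- compute.brlen(toy, method = "Grafen", power = 1)

# One length per edge, so distances can now be computed
toy_bl$edge.length
length(toy_bl$edge.length) == nrow(toy_bl$edge)
```

The point is only that edge.length goes from NULL to a numeric vector with one value per row of edge, which is what the distance functions need.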

Now use the function cophenetic from the ape package to calculate pairwise distances,

that is, the distances between pairs of tips of a phylogenetic tree, using its branch lengths.

dist.nodes does the same but between all nodes of the tree, internal and terminal.

[images: output of cophenetic(tree) and dist.nodes(tree)]
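Since the outputs were only posted as screenshots, here is a small sketch of the two calls on a toy tree. The Newick string and its branch lengths are made-up values chosen so the distances are easy to check by hand; they are not taken from the Phylomatic tree.

```r
library(ape)

# Toy tree with hand-picked branch lengths (illustrative values only)
toy <- read.tree(text = "((quercus_alba:1,castanea_dentata:1):2,acer_rubrum:3);")

# Tip-to-tip distances: a 3 x 3 matrix with tip labels as row and column names
tip_dist <- cophenetic(toy)
tip_dist["quercus_alba", "castanea_dentata"]  # 1 + 1 = 2
tip_dist["quercus_alba", "acer_rubrum"]       # 1 + 2 + 3 = 6

# Distances between all nodes (3 tips + 2 internal nodes): a 5 x 5 matrix
node_dist <- dist.nodes(toy)
dim(node_dist)

# The first 3 rows and columns of dist.nodes are the tips, in tip.label order
all.equal(unname(tip_dist), unname(node_dist[1:3, 1:3]))
```

For ranking equations, cophenetic is probably the more convenient of the two because its matrix is indexed by species name.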

We can also calculate different sets of distances using the function distTips from the adephylo package.

See the documentation for an explanation: https://rdrr.io/cran/adephylo/man/distTips.html

distTips(tree, 1:3)
distTips(tree, 1:3, "nNodes")
distTips(tree, 1:3, "Abouheif")
distTips(tree, 1:3, "sumDD")

[image: distTips output]

and we can plot the tree with distances

plotTree(tree)
edgelabels(round(tree$edge.length, 3), cex = 0.7)

[image: tree plotted with edge lengths as edge labels]

Please use scbi as an example and let me know what you think

scbi <- read.csv("scbi.spptable.csv")  # find data here
scbtree <- phylomatic(taxa = scbi$Latin, get = 'GET')
plot(scbtree, no.margin = TRUE)
str(scbtree)

scbtree <- modified.Grafen(scbtree, power = 2)
cophenetic(scbtree)
dist.nodes(scbtree)

plotTree(scbtree)

labels still need some work

edgelabels(round(scbtree$edge.length,3),cex=0.7)

also read this: https://github.com/jinyizju/S.PhyloMaker

for plotting: http://www.phytools.org/anthrotree/plot/

cpiponiot commented 5 years ago

wow nice work @gonzalezeb ! we can use something like cophenetic(tree) in the weighting function
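One possible shape for that, as a rough sketch only: the helper name distance_weights, the exponential decay and the scale parameter are all assumptions made for illustration, not something decided in this thread, and the distances are made-up toy values laid out like the matrix returned by cophenetic(tree).

```r
# Hypothetical helper: turn tip-to-tip distances into weights that shrink
# as distance grows. The exponential form and `scale` are illustrative choices.
distance_weights <- function(dist_matrix, site_species, equation_species, scale = 1) {
  # Rows: species at the site; columns: species that have an equation
  d <- dist_matrix[site_species, equation_species, drop = FALSE]
  w <- exp(-d / scale)
  w / rowSums(w)  # normalise so each site species' weights sum to 1
}

# Toy distances (made-up values), in the shape returned by cophenetic(tree)
sp <- c("quercus_alba", "castanea_dentata", "acer_rubrum")
toy_dist <- matrix(c(0, 2, 6,
                     2, 0, 6,
                     6, 6, 0),
                   nrow = 3, dimnames = list(sp, sp))

w <- distance_weights(toy_dist,
                      site_species = "quercus_alba",
                      equation_species = c("castanea_dentata", "acer_rubrum"))
w
```

With these toy numbers the closer relative (castanea_dentata) gets the larger weight; how fast the weight should fall off with distance is an open choice.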