Closed csmiguel closed 5 years ago
Solved: ape::cophenetic.phylo
just computes distances between tips. I used ape::dist.nodes
instead, which computes all pairwise distances including patristic distances between internal nodes (as opposed to tips). It returns a matrix from which I select distances in the bottom-right quadrant, corresponding to distances between nodes only:
#Normalize length of the tree
#distance between tips only
disttips <- ape::cophenetic.phylo(tr1) %>% sort()
#distance between all tips and nodes, but selecting only the quandrant of the
# matrix with distances between tips
distnodes <- ape::dist.nodes(tr1)[1:Ntip(tr1), 1:Ntip(tr1)] %>% sort()
#asssert that the order of the columns is tips + nodes
assertthat::assert_that(all(disttips == distnodes),
msg = paste0("check order of tips and nodes in trees"))
#meaning that the bottom-right quadrant of the matrix corresponds to
#distances between internal nodes
maxd <- ape::dist.nodes(tr1)[(ape::Ntip(tr1) + 1):tr1$Nnode,
(ape::Ntip(tr1) + 1):tr1$Nnode] %>% max()
# normalize
tr1$edge.length <- tr1$edge.length / maxd
Fix on matrix selection for getting longest distance:
maxd <- ape::dist.nodes(tr1)[(ape::Ntip(tr1) + 1):(tr1$Nnode + ape::Ntip(tr1)),
(ape::Ntip(tr1) + 1):(tr1$Nnode + ape::Ntip(tr1))] %>% max()
The metrics Dnp and Dnr are done over standarized tree distances. However, in the plots "DnrDnp_ggplot_bw.pdf" it is possible to see that metric for both markers do not have the same highest value. This could be caused by the fact that the lengths of the branches supporting the trees are different between usats and SNPs. For that reason, in the meeting from Sept 13th 2019, it was suggested to normalize the the distances using the maximum distance from any internal node (non-tip nodes) instead of normalizing as I have it now:
maxd <- ape::cophenetic.phylo(tr1) %>% max()
tr1$edge.length <- tr1$edge.length / maxd