Open drabosky opened 7 years ago
I wonder if something is up with ape
. I am looking at the C++ code and am confused.
Anyway, whales
passes our phyx
ultrametricity test, as well as that of phylobase
:
require(phylobase);
phy <- as(whales, "phylo4");
isUltrametric(phy);
[1] TRUE
It seems like the auto tol settings are not doing what they used to. There's a numerical differential in root-to-tip distances of 4e-6, but now causes tree to fail checks. This is an issue because other packages (like diversitree) use is.ultrametric and now reject this (and other) datasets.
Tree first fails with: is.ultrametric(whales, tol = 1e-8) all other manual tol > 1e-8 passes the check.
default tol = .Machine$double.eps^0.5 = 1.490116e-08 anything different about this?
Here is the function from apev3.5
(note same tol
):
is.ultrametric.OLD <- function(phy, tol = .Machine$double.eps^0.5)
{
if (!inherits(phy, "phylo"))
stop('object "phy" is not of class "phylo"')
if (is.null(phy$edge.length))
stop("the tree has no branch lengths")
phy <- reorder(phy)
n <- length(phy$tip.label)
e1 <- phy$edge[, 1]
e2 <- phy$edge[, 2]
EL <- phy$edge.length
## xx: vecteur donnant la distance d'un noeud
## ou d'un tip a partir de la racine
xx <- numeric(n + phy$Nnode)
## the following must start at the root and follow the
## edges contiguously; so the tree must be either in cladewise
## order (or in pruningwise but the for loop must start from
## the bottom of the edge matrix)
for (i in seq_len(length(e1)))
xx[e2[i]] <- xx[e1[i]] + EL[i]
isTRUE(all.equal.numeric(var(xx[1:n]), 0, tolerance = tol))
}
and whales passes:
is.ultrametric.OLD(whales);
[1] TRUE
It seems the old function is available in apev4
:
ape:::.is.ultrametric_ape(whales, tol=.Machine$double.eps^0.5, option=2, n=length(whales$tip.label));
[1] TRUE
The important arg is option
. It factors into this switch:
crit <- switch(option, {
mn <- min(xx.tip)
mx <- max(xx.tip)
(mx - mn)/mx
}, var(xx.tip))
Setting option
to 2 checks whether the variance of the root-to-tip paths are < tol
. If option
is set to 1 it fails:
ape:::.is.ultrametric_ape(whales, tol=.Machine$double.eps^0.5, option=1, n=length(whales$tip.label));
[1] FALSE
So, for the time being you could wrap ape:::.is.ultrametric_ape
to get the old behaviour while you figure things out.
Right, but I think the issue is that other people will be reading in the whales tree and be surprised when their (non-bammtools) packages report that the tree isn't ultrametric.
I wrote a small routine to try to get the tip lengths to cooperate but it's incredibly slow and I'm not convinced that it's actually going to work.
library(ape)
phy <- read.tree("whaletree.tre")
compute_tip_lens <- function(phy) {
e1 <- phy$edge[, 1]
e2 <- phy$edge[, 2]
EL <- phy$edge.length
xx <- numeric(n + phy$Nnode)
for (i in seq_len(length(e1))) xx[e2[i]] <- xx[e1[i]] + EL[i]
xx.tip <- xx[1:n]
}
compute_ultrametric_stat <- function(xx.tip) {
mn <- min(xx.tip)
mx <- max(xx.tip)
(mx - mn)/mx
}
is_ultrametric <- function(phy, tol = .Machine$double.eps^0.5) {
xx.tip <- compute_tip_lens(phy)
isTRUE(all.equal.numeric(compute_ultrametric_stat(xx.tip), 0, tolerance = tol))
}
ctr <- 0
while (!is_ultrametric(phy)) {
tiplen <- compute_tip_lens(phy)
idx <- which.min(tiplen)
phy$edge.length[idx] <- phy$edge.length[idx] - .Machine$double.eps
ctr <- ctr + 1
if (ctr %% 1000 == 0) {
cat(ctr, compute_ultrametric_stat(tiplen), fill = T)
}
}
Yeah, the incompatibilities in other packages will be a problem. I ran into this issue with final analyses in our FiSSE paper (non bammtools packages) and had to include a function check_and_fix_ultrametric to pre-process trees before using in diversitree etc: https://github.com/macroevolution/fisse/blob/master/run_fisse/traitDependent_functions.R#L3-L29
I'm going to send a link to this thread to Emmanuel
Hi guys,
Yes is.ultrametric() has changed with ape 4. This is explained in the help page (and also why the default has changed). Maybe you want to use is.ultrametric(phy, option = 2)
?
Ah, the new ((max - min/max)) default is invariant to the linear scale of the branch lengths. Well, that makes sense, I guess.
...So, why are supposedly ultrametric trees failing under the new default?
the default check seems to be overly sensitive relative to trees produced by BEAST, TreePL, etc at least w a few trees I've tried which I can imagine could cause problems for some users who will just use the defaults. Looks like Rich FitzJohn just made some changes to diversitree on Github on account of this but I haven't looked deeper to see if the package is now ape 4.0 compatible.
Doh, looked like none of use thought to RTFM! :bowtie:
FYI, this may be relevant, if you haven't seen it already: Liam has just posted a new function for phytools that will force a tree to be ultrametric, so to pass is.ultrametric. Apparently a key issue is trees read from files with low numerical precision.
http://blog.phytools.org/2017/03/forceultrametric-method-for-ultrametric.html
Thanks Dave! I've just posted a comment on Liam's blog.
Hej,
I just wanted to let you know that the change seems to have broken my code, too. I for instance calculate phylogenetic diversity with the entropart
package which contains a function Preprocess.Tree
that in turn internal calls ape::as.hclust.phylo
which calls is.ultrametric
which now fails. I checked and the tree is ultrametric if I set the tol
to tol=.Machine$double.eps^0.31
but not for lower tolerances. It works also with option = 2
, but again it's a function that calls a function that calls is.ultrametric
so i would have to redefine the function to make it work. I saw the suggestion from Liam, but the nnls
option takes very long to calculate for my tree (16S tree with 8187 tips).
Update: for what it's worth, @drabosky function works in a reasonable time but it's not an ideal solution...
the default check seems to be overly sensitive relative to trees produced by BEAST, TreePL, etc at least w a few trees I've tried which I can imagine could cause problems for some users who will just use the defaults. Looks like Rich FitzJohn just made some changes to diversitree on Github on account of this but I haven't looked deeper to see if the package is now ape 4.0 compatible.
Diversitree is in "appease CRAN only" maintenance mode and this my changes were pretty much to stop the function running on some tests (that aren't run on CRAN anyway). Thanks for everyone for finding the actual issue :grinning:
Whales tree now fails ultrametric test in ape. Pretty sure this didn't happen previously. It's definitely a tiny numerical issue and easily fixed, but we should fix tree by adding trivial branch length to tips to ensure it passes ape ultrametric check.