Closed wcornwell closed 6 years ago
Thanks for the positive words and the bug report. Looks like I can replicate this bug as well.
The segfault crashing R appears to arise from the plotting function in the ape package. I'll report this to the ape maintainer and see if he has any ideas.
The example tree shown there (S100.xml) was selected at random and seems to have some pathology for the ape plotting function that I cannot see. For instance, I do not get any errors with other trees from the TreeBASE data right now, e.g. S101.xml. I'll have to dig deeper with the ape maintainer and find out what it is about that tree that causes ape to segfault.
Cheers,
Carl
My guess is one or more non-splitting nodes in one of the trees?
For what it's worth,
plot(tr_phy[[1]][[1]]) and plot(tr_phy[[1]][[2]]) work, but plot(tr_phy[[1]][[3]]) and plot(tr_phy[[1]][[4]]) trigger the segfault.
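To test the non-splitting-node guess without calling plot (and risking the segfault), the edge matrix can be inspected directly. The following is a minimal sketch in base R, assuming the usual ape-style layout in which column 1 of the edge matrix holds parent nodes and column 2 holds child nodes. The helper name singleton_nodes and the toy tree are hypothetical, made up for illustration.

```r
# Hypothetical helper: return internal nodes that have exactly one child.
# Assumes an ape-style edge matrix (column 1 = parent, column 2 = child).
singleton_nodes <- function(edge) {
  child_counts <- table(edge[, 1])
  as.integer(names(child_counts)[as.vector(child_counts) == 1])
}

# Toy tree with three tips (1-3). Node 5 has a single child (node 6),
# so it is a non-splitting node.
edge <- matrix(c(4, 1,
                 4, 5,
                 5, 6,
                 6, 2,
                 6, 3), ncol = 2, byrow = TRUE)

singleton_nodes(edge)
```

If the elements of tr_phy are ape phylo objects, something like singleton_nodes(tr_phy[[1]][[3]]$edge) might show whether the failing trees contain such nodes.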
@emmanuelparadis points out to me:
I had a look at your trees using that function I put on github:
https://github.com/emmanuelparadis/checkValidPhylo
and they all return at least one FATAL message. I hope this may help you to fix some issues in RNeXML.
I'll look further how to catch such issues in the most efficient way in ape.
I'll have to see whether these issues are already present in the TreeBASE NeXML and figure out the best way to handle them. Perhaps we should still parse the NeXML with nexml_read but refuse to return an ape format from get_trees if the trees do not pass these checks?
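The "parse, but refuse to return an invalid ape object" idea could look roughly like the sketch below. It is not how get_trees works and it does not reproduce the checks in checkValidPhylo; the function names edge_problems and safe_get_tree and the specific conditions are assumptions chosen for illustration, written in base R against a plain list with edge and tip.label elements.

```r
# Hypothetical validation of an ape-style tree (a list with `edge`, a
# two-column parent/child matrix, and `tip.label`). Illustrative checks only.
edge_problems <- function(tree) {
  edge <- tree$edge
  n_tip <- length(tree$tip.label)
  if (!is.matrix(edge) || ncol(edge) != 2) {
    return("edge is not a two-column matrix")
  }
  if (anyNA(edge)) {
    return("edge contains missing node numbers")
  }
  problems <- character(0)
  if (any(edge[, 1] <= n_tip)) {
    problems <- c(problems, "a tip is used as a parent node")
  }
  if (any(duplicated(edge[, 2]))) {
    problems <- c(problems, "a node has more than one parent")
  }
  if (!all(seq_len(n_tip) %in% edge[, 2])) {
    problems <- c(problems, "some tips are missing from the edge matrix")
  }
  problems
}

# Hypothetical guard: return the tree only if it passes, otherwise raise
# an ordinary R error instead of handing a bad object to plotting code.
safe_get_tree <- function(tree) {
  problems <- edge_problems(tree)
  if (length(problems) > 0) {
    stop("tree failed validation: ", paste(problems, collapse = "; "),
         call. = FALSE)
  }
  tree
}

good <- list(edge = matrix(c(4, 1, 4, 5, 5, 2, 5, 3), ncol = 2, byrow = TRUE),
             tip.label = c("a", "b", "c"))
bad <- list(edge = matrix(c(4, 1, 1, 2, 4, 3), ncol = 2, byrow = TRUE),
            tip.label = c("a", "b", "c"))

edge_problems(good)
tryCatch(safe_get_tree(bad), error = function(e) conditionMessage(e))
```

The design choice here is that the NeXML itself stays readable, and only the conversion step fails, with a message the user can act on.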
It appears that ape's routines, particularly for plotting, can still throw segfaults under a range of conditions, even for perfectly bifurcating trees. Part of the challenge is that the ape format seems to make quite a few arbitrary assumptions about how a tree is represented: for instance, not all topologically equivalent edge lists are treated the same. I don't understand why order should matter; a standard representation of a node list (with possible attributes) and an edge list (with possible attributes) would be ideal. Closing this, as I view these as issues better handled in ape: a package should respond with an appropriate error message if data do not meet some requirement, not throw a segfault.
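To make the point about row order concrete, here is a small base R sketch with two edge matrices that describe the same topology but list the edges in a different order. The helper canonical_edges is hypothetical and only sorts rows for comparison; it does not produce any ordering that ape itself expects, and it does not handle relabelled nodes.

```r
# Two edge matrices (column 1 = parent, column 2 = child) describing the
# same three-tip tree, with rows in a different order.
edge_a <- matrix(c(4, 1,
                   4, 5,
                   5, 2,
                   5, 3), ncol = 2, byrow = TRUE)
edge_b <- edge_a[c(3, 1, 4, 2), ]

# As raw matrices they differ, although the set of edges is the same.
identical(edge_a, edge_b)

# Hypothetical helper: sort rows by parent, then child, so that row order
# no longer affects the comparison.
canonical_edges <- function(edge) {
  edge[order(edge[, 1], edge[, 2]), , drop = FALSE]
}

identical(canonical_edges(edge_a), canonical_edges(edge_b))
```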
Pretty cool project! Looking forward to seeing where it's going. Especially for plotting traits and phylogenies, there is certainly a need for a general framework that people can build on.
Anyway was just checking out the plotting and found the following: the example plotting code from the arxiv.org vignette:
At least on my system this crashes R. This is R 3.1.3 on Mac OS with the CRAN version of the package.
In terminal I get this error:
Error: segfault from C stack overflow
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf
6: In max(x) : no non-missing arguments to max; returning -Inf
7: In min(x) : no non-missing arguments to min; returning Inf
8: In max(x) : no non-missing arguments to max; returning -Inf
9: In min(x) : no non-missing arguments to min; returning Inf
10: In max(x) : no non-missing arguments to max; returning -Inf
Error: C stack usage 140730498580684 is too close to the limit
Then it crashes completely.