emmanuelparadis / ape

analysis of phylogenetics and evolution
http://ape-package.ird.fr/
GNU General Public License v2.0
52 stars 11 forks source link

write.tree() may export duplicate internal node labels without notification #113

Closed stitam closed 5 months ago

stitam commented 5 months ago

Hi,

My issue is the following: It may happen that a tree object ends up with less internal node labels than the number of internal nodes. When I export such an object with write.tree(), the function will start recycling internal node names, leading to duplicated internal node labels:

# create a random tree
set.seed(0)
tree <- ape::rtree(10)
# define node labels
tree$node.label <- paste0("Node_", 1:(tree$Nnode-2))
tree$node.label
#> [1] "Node_1" "Node_2" "Node_3" "Node_4" "Node_5" "Node_6" "Node_7"
# export and import
ape::write.tree(tree, file = "tree.nwk")
tree2 <- ape::read.tree("tree.nwk")
tree2$node.label
#> [1] "Node_1" "Node_2" "Node_3" "Node_4" "Node_5" "Node_6" "Node_7" "Node_1"
#> [9] "Node_2"

Created on 2024-03-22 with reprex v2.1.0

Linked issue providing an example where such a tree is accidentally generated: https://github.com/ms609/TreeTools/issues/149.

For the sake of addressing this issue from multiple directions, I wonder if write.tree() could be updated to at least give a warning message when node names are being recycled (stopping with an informative error would be even better). Do you think this is reasonable? Many thanks.

emmanuelparadis commented 5 months ago

Hi, Have you tried:

R> tree <- makeNodeLabel(tree, prefix = "Node_")
R> tree$node.label
[1] "Node_1" "Node_2" "Node_3" "Node_4" "Node_5" "Node_6"
[7] "Node_7" "Node_8" "Node_9"

?

ms609 commented 5 months ago

I think the issue here is the behaviour of write.tree() when a tree has been created by another (potentially erroneous) process, such that length(tree$node.label) < tree$Nnode.

I've offered a possible solution at #115.

stitam commented 5 months ago

Thanks @emmanuelparadis! Once I discovered the root of the problem I could fix it easily; I opened the issues because I thought it might be better for the long term to highlight/eliminate the inconsistency. Thanks @ms609 I think an informative warning like that is a good solution from ape's side; It is probably not within ape's scope to fix an object that was created by another package.