emmanuelparadis / ape

analysis of phylogenetics and evolution
http://ape-package.ird.fr/
GNU General Public License v2.0

new function: keep.as.tip #87

Closed HedvigS closed 1 year ago

HedvigS commented 1 year ago

New function suggestion for ape: keep.as.tip takes a list of tip and/or node labels and returns a tree pruned to those. If a label refers to an internal node, all descendants of that node are pruned so that the internal node becomes a tip.
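
As a rough illustration of the behaviour requested here, the following sketch uses only functions exported by ape. It is not the function from the pull request: keep_as_tip_sketch is a hypothetical name, and the approach (relabel one descendant tip of each requested node with the node's label, prune with keep.tip, then shorten the new tip's branch back to the node's position) is just one assumed way to get the described result. It further assumes unique labels, at least two labels to keep, and no requested label nested inside a requested node.

```r
library(ape)

# Hypothetical helper, not part of ape or phangorn.
keep_as_tip_sketch <- function(tree, labels) {
  stopifnot(inherits(tree, "phylo"), length(labels) >= 2)
  is_tip  <- labels %in% tree$tip.label
  is_node <- !is_tip & labels %in% tree$node.label
  if (!all(is_tip | is_node))
    stop("unmatched labels: ",
         paste(labels[!(is_tip | is_node)], collapse = " "))

  n_tip   <- Ntip(tree)
  has_len <- !is.null(tree$edge.length)
  # Distance of every tip and node from the root (needs branch lengths).
  depth <- if (has_len) node.depth.edgelength(tree) else NULL

  out     <- tree
  covered <- match(labels[is_tip], tree$tip.label)  # tips already requested
  excess  <- numeric(0)                             # extra length per new tip
  for (lab in labels[is_node]) {
    node  <- match(lab, tree$node.label) + n_tip
    clade <- match(extract.clade(tree, node)$tip.label, tree$tip.label)
    if (any(clade %in% covered))
      stop("label '", lab, "' overlaps with another requested label")
    covered <- c(covered, clade)
    # One descendant tip stands in for the whole clade.
    out$tip.label[clade[1]] <- lab
    if (has_len) excess[lab] <- depth[clade[1]] - depth[node]
  }

  out <- keep.tip(out, labels)

  # The stand-in tip sits below the original node; move it back up.
  for (lab in names(excess)) {
    e <- match(match(lab, out$tip.label), out$edge[, 2])
    out$edge.length[e] <- out$edge.length[e] - excess[[lab]]
  }
  out
}

# Toy tree, invented for this example.
toy <- read.tree(text = "(((a:1,b:1)ab:1,c:1)abc:1,(d:1,e:1)de:1)root;")
pruned <- keep_as_tip_sketch(toy, c("ab", "c", "de"))
print(write.tree(pruned))
```

The overlap check is a deliberate simplification: a fuller implementation would have to decide what to do when both a node and one of its descendants are requested.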

KlausVigo commented 1 year ago

Dear Hedvig & Emmanuel, it seems useful to add a ... argument to keep.tip so that all the drop.tip arguments are accessible. drop.tip has an argument trim.internal; setting it to FALSE keeps internal edges rather than pruning them away. In some cases having access to trim.internal=FALSE might be exactly what's intended with this pull request. Emmanuel, I can make a pull request if you want. I should probably also allow character vectors in the Descendants, Ancestors and Siblings functions in phangorn ;).
A different strategy is to call drop.tip first, with trim.internal=FALSE, to remove all descendants of the internal nodes you want to keep as tips. Then call keep.tip on the resulting tree to remove all other tips you don't want to keep.

tmp <- Descendants(tree, nodes_to_keep) |> unlist() |> unique() 
tree <- drop.tip(tree, tmp, trim.internal=FALSE)
tree <- keep.tip(tree, tips_and_nodes_to_keep)

Kind regards, Klaus
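
To make the role of trim.internal concrete, here is a minimal sketch on a toy tree (the tree and its labels are made up for illustration). According to the drop.tip documentation, with trim.internal = FALSE an internal node whose tips have all been removed is kept as a new tip, and it takes its node label if the tree has node labels.

```r
library(ape)

# Toy tree, invented for this example: clade "ab" holds tips a and b.
toy <- read.tree(text = "((a:1,b:1)ab:1,(c:1,d:1)cd:1)root;")

# Default behaviour: the emptied internal branch is deleted as well.
trimmed <- drop.tip(toy, c("a", "b"))

# trim.internal = FALSE: the emptied internal node "ab" stays as a tip.
untrimmed <- drop.tip(toy, c("a", "b"), trim.internal = FALSE)

print(trimmed$tip.label)
print(untrimmed$tip.label)
```

This toy case only covers a node whose children are all tips; deeper clades may behave differently, so it is worth checking the result on the actual tree.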

HedvigS commented 1 year ago

Thanks @KlausVigo !

In some cases having access to trim.internal=FALSE might be exactly what's intended with this pull request. Emmanuel, I can make a pull request if you want.

Yes, sometimes but not always. This function that I wrote is the result of a lot of testing of different scenarios. I wrote more here too: https://hedvigsr.tumblr.com/post/676182151636647936/pruning-to-nodes-and-keeping-tips

KlausVigo commented 1 year ago

Hi @HedvigS, a small improvement to my comment. I changed the code in phangorn on GitHub, so Descendants and some other functions now accept a character vector. In ape I added a ... argument to keep.tip. Another problem was that drop.tip(tree, tmp, subtree = TRUE) labels the new tips [3_tips] or [7_tips] instead of using the node names. I changed the code so that it uses the node labels if these are available. The following code seems to do what you want:

#let’s install and load the necessary R packages first
remotes::install_github("KlausVigo/phangorn")
remotes::install_github("KlausVigo/ape")
library(ape)
library(phangorn)

tree_string <- "((((chol1282:1,(buen1245:1,mira1253:1,tamu1247:1)taba1266:1)chol1281:1,(chol1283:1,chor1273:1)chor1272:1,epig1241:1)chol1287:1,((chan1320:1,tena1239:1)tzel1254:1,tzot1259:1)tzel1253:1)chol1286:1,((chuj1250:1,tojo1241:1)chuj1249:1,((west2635:1,popt1235:1,qanj1241:1)kanj1263:1,(moto1243:1,tuza1238:1)moch1257:1)kanj1262:1)kanj1261:1)west2865:1;"

tree <- ape::read.tree(text = tree_string)
par(mfrow=c(3,1), oma=c(1,1,1,1))
plot(tree, show.node.label = TRUE)
nodes_to_keep <- c("kanj1261", "tzel1253")
tips_and_nodes_to_keep <- c("kanj1261", "tzel1253", "epig1241")
tips_to_remove <-  Descendants(tree, nodes_to_keep) |> unlist() |> unique()
tree_1 <- drop.tip(tree, tips_to_remove, subtree = TRUE)
plot(tree_1, show.node.label = TRUE)
tree_2 <- keep.tip(tree_1, tips_and_nodes_to_keep, collapse.singles=FALSE)
plot(tree_2, show.node.label = TRUE)

HedvigS commented 1 year ago

Oh wow!!! Yes I think it does, I haven't had time to test but looks good! A function would be nice though, just to make it neat :). Either in ape or phangorn.

emmanuelparadis commented 1 year ago

Hi! Thanks for the great discussion! At the moment, I don't have enough free time to dive into all these code suggestions, but I wish to point out that ape cannot host scripts or very specific/specialized functions. ape is already a (relatively) big package and maintenance is difficult because of the many reverse dependencies. @HedvigS: As you mentioned on your blog, there are many other scripts on the Internet. However, I can see that a database/repository/list of those scripts would be a great resource for the community (maybe just with a title, a [short] description, and the link). Just a thought. (While writing this, I realize that most peer-reviewed journals ask for the scripts to be deposited somewhere, so there may already be a lot of resources out there.) I'll have a look at your code later. Cheers,

HedvigS commented 1 year ago

No worries! I totally get it. I really like ape, and a function of this kind seemed like a natural extension of keep.tip(), but there's no urgency. Please take your time, think about it and don't worry :)

emmanuelparadis commented 1 year ago

@KlausVigo I checked your PR (#90), which sounds great to me. @HedvigS does that make you happy?

emmanuelparadis commented 1 year ago

closing this after merging PR #90

HedvigS commented 1 year ago

I'm sorry I didn't respond earlier Emmanuel! I was on vacation, and then things piled up. I'm going to have a look at it now but I'm confident you've done a great implementation :).