YuLab-SMU / treeio

:seedling: Base Classes and Functions for Phylogenetic Tree Input and Output
https://yulab-smu.top/treedata-book/

update as.phylo and as.treedata method for data.frame class #88

Status: Closed (xiangpin closed this 1 year ago)

xiangpin commented 1 year ago

Description

Update the as.phylo and as.treedata methods for the data.frame class.

Related Issue

#87

#78

Example 1

> library(treeio)
treeio v1.21.3 For help: https://yulab-smu.top/treedata-book/

If you use the ggtree package suite in published research, please cite
the appropriate paper(s):

LG Wang, TTY Lam, S Xu, Z Dai, L Zhou, T Feng, P Guo, CW Dunn, BR
Jones, T Bradley, H Zhu, Y Guan, Y Jiang, G Yu. treeio: an R package
for phylogenetic tree input and output with richly annotated and
associated data. Molecular Biology and Evolution. 2020, 37(2):599-603.
doi: 10.1093/molbev/msz240

G Yu. Data Integration, Manipulation and Visualization of Phylogenetic
Trees (1st ed.). Chapman and Hall/CRC. 2022. ISBN: 9781032233574

Guangchuang Yu. (2022). Data Integration, Manipulation and
Visualization of Phylogenetic Trees (1st edition). Chapman and
Hall/CRC. 
> df = data.frame(parent = c(5,7,7,6,5,5,6),
                 node = c(1,2,3,4,5,6,7),
                 label = c('t4','t1','t3','t2',NA,NA,NA))
> as.phylo(df, label=label)

Phylogenetic tree with 4 tips and 3 internal nodes.

Tip labels:
  t4, t1, t3, t2

Rooted; no branch lengths.
> as.treedata(df, label=label) 
'treedata' S4 object'.

...@ phylo:

Phylogenetic tree with 4 tips and 3 internal nodes.

Tip labels:
  t4, t1, t3, t2

Rooted; no branch lengths.
> as.treedata(df, label=label) %>% as_tibble()
# A tibble: 7 × 3
  parent  node label
   <dbl> <dbl> <chr>
1      5     1 t4   
2      7     2 t1   
3      7     3 t3   
4      6     4 t2   
5      5     5 NA   
6      5     6 NA   
7      6     7 NA   
> as.treedata(df) %>% as_tibble()
# A tibble: 7 × 4
  parent  node label.x label.y
   <dbl> <dbl> <chr>   <chr>  
1      5     1 1       t4     
2      7     2 2       t1     
3      7     3 3       t3     
4      6     4 4       t2     
5      5     5 5       NA     
6      5     6 6       NA     
7      6     7 7       NA     
> as.phylo(df)

Phylogenetic tree with 4 tips and 3 internal nodes.

Tip labels:
  1, 2, 3, 4
Node labels:
  5, 6, 7

Rooted; no branch lengths.
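
To make the table layout in Example 1 easier to follow, here is a minimal base R sketch of how a parent/node table relates to an ape-style edge matrix. It is an illustration only, not treeio's implementation: `edge_from_table` is a hypothetical helper, and the sketch assumes the root is the one row whose parent equals its own node id, as in the `df` above.

```r
# Illustration only: NOT treeio's implementation.
# Same table as in Example 1.
df <- data.frame(parent = c(5, 7, 7, 6, 5, 5, 6),
                 node   = c(1, 2, 3, 4, 5, 6, 7),
                 label  = c("t4", "t1", "t3", "t2", NA, NA, NA))

# Hypothetical helper: derive a parent -> child edge matrix from the table.
edge_from_table <- function(d) {
  # Assumption: the root row is the one whose parent equals its own node id.
  # It is dropped because an edge matrix lists only parent -> child pairs.
  is_root <- d$parent == d$node
  edge <- as.matrix(d[!is_root, c("parent", "node")])
  dimnames(edge) <- NULL
  storage.mode(edge) <- "integer"
  edge
}

edge <- edge_from_table(df)

# Tips are the nodes that never appear in the parent column.
tips <- setdiff(df$node, df$parent)

# Internal nodes are the distinct parents.
n_internal <- length(unique(edge[, 1]))
```

With this table the sketch yields six edges, four tips and three internal nodes, which agrees with the "4 tips and 3 internal nodes" summary printed by as.phylo(df, label=label).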

Example 2

> library(ggtree) 
ggtree v3.5.2.991 For help:
https://yulab-smu.top/treedata-book/

If you use the ggtree package suite in published research, please cite
the appropriate paper(s):

Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam.
ggtree: an R package for visualization and annotation of phylogenetic
trees with their covariates and other associated data. Methods in
Ecology and Evolution. 2017, 8(1):28-36. doi:10.1111/2041-210X.12628

Guangchuang Yu. Using ggtree to visualize data on tree-like structures.
Current Protocols in Bioinformatics. 2020, 69:e96. doi:10.1002/cpbi.96

Guangchuang Yu, Tommy Tsan-Yuk Lam, Huachen Zhu, Yi Guan. Two methods
for mapping and visualizing associated data on phylogeny using ggtree.
Molecular Biology and Evolution. 2018, 35(12):3041-3043.
doi:10.1093/molbev/msy194 
> set.seed(123)
> ape::rtree(6) %>% ape::unroot() -> tr
> ggtree(tr) + geom_tiplab() -> p1
> as_tibble(tr) %>% data.frame() %>% dplyr::slice(sample(nrow(.))) -> dt
> dt
   parent node branch.length label
1       7    9    0.69942189  <NA>
2       9   10    0.32792072  <NA>
3      10    5    0.95450365    t1
4       7    3    0.89982497    t5
5       8    2    0.10292468    t6
6      10    6    0.88953932    t4
7       8    1    0.57263340    t2
8       7    8    0.67757064  <NA>
9       9    4    0.04205953    t3
10      7    7            NA  <NA>
> as.phylo(dt, branch.length, label) %>% ggtree() + geom_tiplab() -> p2
> p1 / p2
Screenshot 2022-10-30 23 53 48
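
Example 2 feeds as.phylo a table whose rows are in random order. The base R sketch below suggests why row order should not matter: each row carries its own node id, so the tree structure can be recovered from the columns alone. The values are copied from the `dt` printout above. The names `sorted`, `tips` and `root` are introduced only for this illustration, and the sketch does not call treeio or claim to mirror its internals.

```r
# Illustration only: the shuffled table from Example 2, typed in by hand.
dt <- data.frame(
  parent = c(7, 9, 10, 7, 8, 10, 8, 7, 9, 7),
  node   = c(9, 10, 5, 3, 2, 6, 1, 8, 4, 7),
  branch.length = c(0.69942189, 0.32792072, 0.95450365, 0.89982497,
                    0.10292468, 0.88953932, 0.57263340, 0.67757064,
                    0.04205953, NA),
  label  = c(NA, NA, "t1", "t5", "t6", "t4", "t2", NA, "t3", NA)
)

# Sorting by node id restores a canonical order regardless of the shuffle.
sorted <- dt[order(dt$node), ]
rownames(sorted) <- NULL

# Tips never appear as a parent; the root is its own parent.
tips <- setdiff(sorted$node, sorted$parent)
root <- sorted$node[sorted$parent == sorted$node]

# Tip labels in node order.
tip_labels <- sorted$label[sorted$node %in% tips]
```

Here the tips come out as nodes 1 to 6 and the root as node 7, which has three children (3, 8 and 9), as expected for a tree passed through ape::unroot().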

Example 3

> library(treeio)
> library(tidygraph)
> xx <- readRDS('./tree_final_1.rds')
> tr <- as.phylo(xx)
Warning message:
The `x` argument of `as_tibble.matrix()` must have unique column names if `.name_repair` is omitted as of tibble 2.0.0.
Using compatibility `.name_repair`.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated. 
> tr

Phylogenetic tree with 3895 tips and 3894 internal nodes.

Tip labels:
  ATCCATTCACGGTAGA-1, CATTGTTGTTGCGTAT-1, TGATTCTCAGGACAGT-1, AAGATAGTCTGTGCAA-1, CATGCGGAGAGTTCGG-1, TTCTCTCCAAAGCGTG-1, ...
Node labels:
  Node1, Node2, Node3, Node4, Node5, Node6, ...

Rooted; no branch lengths.
> tr %>% plot()
Screenshot 2022-10-31 00 03 42