YuLab-SMU / treeio

:seedling: Base Classes and Functions for Phylogenetic Tree Input and Output
https://yulab-smu.top/treedata-book/

update read.nhx to parse large tree file #51

Closed xiangpin closed 3 years ago

xiangpin commented 3 years ago

Description

Related Issue

If you use treeio in published research, please cite:

LG Wang, TTY Lam, S Xu, Z Dai, L Zhou, T Feng, P Guo, CW Dunn, BR Jones, T Bradley, H Zhu, Y Guan, Y Jiang, G Yu. treeio: an R package for phylogenetic tree input and output with richly annotated and associated data. Molecular Biology and Evolution 2020, 37(2):599-603. doi: 10.1093/molbev/msz240

system.time(tr <- read.nhx("./test.nhx"))
   user  system elapsed
  2.016   0.016   2.044
tr
'treedata' S4 object that stored information of './test.nhx'.

...@ phylo: Phylogenetic tree with 20000 tips and 19999 internal nodes.

Tip labels: aaaaaabmnp, aaaaaabmnq, aaaaaabmnr, aaaaaabmns, aaaaaabmnt, aaaaaabmnu, ...
Node labels: , NoName, NoName, NoName, NoName, NoName, ...

Rooted; no branch lengths.

with the following features available: 'dist', 'support'.


GuangchuangYu commented 3 years ago

good work :)

brj1 commented 3 years ago

I've found that read.beast is also slow. If you want to read in a BEAST tree file with 10,000 trees, it is basically impossible. I wonder whether similar improvements could be made to read.beast.