The original read.nhx is too slow to parse large tree files because it uses a loop to replace the node labels. I used gsub to move the metadata next to the labels, e.g. "t1:0.03[&&NHX:test=1]" becomes "t1[&&NHX:test=1]:0.03"; the remaining steps are then similar to the processing of a beast file.
I also used gsub to adjust the beast metadata, changing "t1:0.01[&test=1,rare=0.5]" to "t1[&test=1,rare=0.5]:0.01".
The metadata of astral tree files is converted to numeric type when needed.
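The gsub rewrite described above can be sketched in base R as follows. This is only an illustration, not the code in this PR: the helper names `move_annotation` and `as_numeric_if_possible` and the regular expression are assumptions. The pattern assumes the annotation contains no nested `]` and that the branch length is a plain or scientific-notation number.

```r
# Hypothetical helper (not part of treeio): move a bracketed annotation
# from after the branch length to directly after the node label.
move_annotation <- function(newick) {
  # Group 1: ":<branch length>"; group 2: "[&...]" up to the first "]".
  pattern <- "(:[0-9.eE+-]+)(\\[&[^\\]]*\\])"
  # Swap the two groups in a single vectorised pass, with no per-node loop.
  gsub(pattern, "\\2\\1", newick, perl = TRUE)
}

# Hypothetical helper: convert a character vector to numeric only when
# every non-missing value parses as a number; otherwise return it unchanged.
as_numeric_if_possible <- function(x) {
  converted <- suppressWarnings(as.numeric(x))
  if (any(is.na(converted) & !is.na(x))) x else converted
}

move_annotation("t1:0.03[&&NHX:test=1]")       # NHX-style annotation
move_annotation("t1:0.01[&test=1,rare=0.5]")   # beast-style annotation
as_numeric_if_possible(c("0.5", "1"))
as_numeric_if_possible(c("0.5", "high"))
```

Because gsub is vectorised and handles every match in one call, the cost no longer grows with an explicit R-level loop over nodes, which is the likely source of the speed-up on a 20000-tip tree.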
Related Issue
the large nhx tree file in #12
The Rd documentation and examples of write.beast.newick and read.beast.newick in #50 were incorrect; I have corrected them.
root.phylo has been removed from treeio because it generates warnings. #45
Example
> library(treeio)
treeio v1.15.5 For help: https://yulab-smu.top/treedata-book/
If you use treeio in published research, please cite:
LG Wang, TTY Lam, S Xu, Z Dai, L Zhou, T Feng, P Guo, CW Dunn, BR Jones, T Bradley, H Zhu, Y Guan, Y Jiang, G Yu. treeio: an R package for phylogenetic tree input and output with richly annotated and associated data. Molecular Biology and Evolution 2020, 37(2):599-603. doi: 10.1093/molbev/msz240
> system.time(tr <- read.nhx("./test.nhx"))
   user  system elapsed
  2.016   0.016   2.044
> tr
'treedata' S4 object that stored information of
'./test.nhx'.
...@ phylo:
Phylogenetic tree with 20000 tips and 19999 internal nodes.
I've found that read.beast is also slow. Reading in a beast tree file with 10,000 trees is basically impossible. I wonder if similar improvements could be made to read.beast.
Complete printed summary of tr from the example above:
...@ phylo:
Phylogenetic tree with 20000 tips and 19999 internal nodes.
Tip labels: aaaaaabmnp, aaaaaabmnq, aaaaaabmnr, aaaaaabmns, aaaaaabmnt, aaaaaabmnu, ...
Node labels: , NoName, NoName, NoName, NoName, NoName, ...
Rooted; no branch lengths.
with the following features available: 'dist', 'support'.