CompEvol / beast2

Bayesian Evolutionary Analysis by Sampling Trees
www.beast2.org
GNU Lesser General Public License v2.1

Associate data with internal nodes via IDs #1173

Open rbouckaert opened 1 day ago

rbouckaert commented 1 day ago

For fixed-tree analyses, or just for initialisation, it can be useful to have a starting tree whose internal nodes are identified by name instead of by number. Currently only numbers are recognised, which makes it cumbersome to, for example, define specific values for a RealParameter associated with internal nodes.

Perhaps the easiest approach is to treat the "id" metadata as special and assign it to the node's ID, so that a three-taxon Newick tree becomes something like ((A:1.0,B:1.0)[id="D"]:1.0,C:2.0); and then have a TreeTrait that defines properties for A, B, C and D.
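To make the proposal concrete, here is a minimal, self-contained sketch of pulling `id` metadata off internal nodes in a Newick string. It is plain Java with a regular expression and does not use TreeParser or any other BEAST 2 class; the class name `InternalNodeIds` and the method `extract` are hypothetical, and it assumes the metadata block contains only the single `id="..."` entry written exactly as in the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BEAST 2.
public class InternalNodeIds {
    // A closing parenthesis immediately followed by [id="..."] marks a named internal node.
    private static final Pattern ID_META = Pattern.compile("\\)\\[id=\"([^\"]*)\"\\]");

    /** Returns the internal-node ids in the order their subtrees close in the string. */
    public static List<String> extract(String newick) {
        List<String> ids = new ArrayList<>();
        Matcher m = ID_META.matcher(newick);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // prints [D]
        System.out.println(extract("((A:1.0,B:1.0)[id=\"D\"]:1.0,C:2.0);"));
    }
}
```

A real implementation would presumably hook into the existing metadata handling rather than re-scan the string; the sketch only shows which part of the Newick text would carry the ID.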

walterxie commented 1 day ago

Perhaps it would be more useful to use the standard Newick format, e.g. ((A:1.0,B:1.0)D:1.0,C:2.0)E:0.0;?

https://en.wikipedia.org/wiki/Newick_format
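For comparison, the standard-format suggestion can be sketched as a small recursive-descent reader that collects the labels written directly after a closing parenthesis. This is an illustration only, not the TreeParser implementation: the class `NewickLabels` is hypothetical, and it assumes unquoted labels, no whitespace, and no bracketed metadata in the input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BEAST 2.
public class NewickLabels {
    private final String s;
    private int pos = 0;
    private final List<String> internal = new ArrayList<>();

    private NewickLabels(String s) { this.s = s; }

    /** Returns the labels of internal nodes, children before parents. */
    public static List<String> internalLabels(String newick) {
        NewickLabels p = new NewickLabels(newick);
        p.node();
        return p.internal;
    }

    // node := ( '(' node (',' node)* ')' )? label? (':' length)?
    private void node() {
        boolean isInternal = false;
        if (peek() == '(') {
            isInternal = true;
            pos++;                               // consume '('
            node();
            while (peek() == ',') { pos++; node(); }
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at position " + pos);
            }
            pos++;                               // consume ')'
        }
        String label = token();                  // may be empty
        if (peek() == ':') { pos++; token(); }   // skip the branch length
        if (isInternal && !label.isEmpty()) internal.add(label);
    }

    // Reads characters up to the next structural character.
    private String token() {
        int start = pos;
        while (pos < s.length() && "(),:;".indexOf(s.charAt(pos)) < 0) pos++;
        return s.substring(start, pos);
    }

    private char peek() { return pos < s.length() ? s.charAt(pos) : ';'; }

    public static void main(String[] args) {
        // prints [D, E]
        System.out.println(internalLabels("((A:1.0,B:1.0)D:1.0,C:2.0)E:0.0;"));
    }
}
```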

rbouckaert commented 1 day ago

@walterxie TreeParser with named leaves requires a TaxonSet to associate labels with taxa, but internal nodes would have to be treated differently since they are not taxa, which means changing the TreeParser.

Right now it is already possible to associate names with internal nodes via metadata, e.g. ((A:1.0,B:1.0)[name="D"]:1.0,C:2.0)[name="E"];. So perhaps this is just a matter of associating data with internal node names, by making a StateNodeInitialiser aware of both the tree (with its internal node names) and some data source.
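A rough sketch of what such an initialiser would have to do, independent of any BEAST 2 API: given a lookup from node name to node number and a data source keyed by node name, fill an array indexed by node number. Everything here is hypothetical (the class `NamedNodeValues`, the method `toArray`, and the example numbering A=0, B=1, C=2, D=3, E=4), and it assumes node numbers run contiguously from 0.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper, not part of BEAST 2.
public class NamedNodeValues {
    /**
     * Builds an array indexed by node number from values keyed by node name.
     * Nodes without an entry in valueByName receive defaultValue.
     */
    public static double[] toArray(Map<String, Integer> nodeNumberByName,
                                   Map<String, Double> valueByName,
                                   double defaultValue) {
        double[] values = new double[nodeNumberByName.size()];
        Arrays.fill(values, defaultValue);
        for (Map.Entry<String, Double> e : valueByName.entrySet()) {
            Integer nr = nodeNumberByName.get(e.getKey());
            if (nr == null) {
                throw new IllegalArgumentException("Unknown node name: " + e.getKey());
            }
            values[nr] = e.getValue();
        }
        return values;
    }

    public static void main(String[] args) {
        // Assumed numbering for the example tree: leaves first, then internal nodes.
        Map<String, Integer> numbers = Map.of("A", 0, "B", 1, "C", 2, "D", 3, "E", 4);
        Map<String, Double> data = Map.of("D", 0.5, "E", 2.0);
        // prints [1.0, 1.0, 1.0, 0.5, 2.0]
        System.out.println(Arrays.toString(toArray(numbers, data, 1.0)));
    }
}
```

In an actual initialiser the name-to-number lookup would come from the parsed tree's name metadata and the resulting array would be used to set the RealParameter; failing loudly on unknown names seems preferable to silently ignoring typos in the data source.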