jeetsukumaran / DendroPy

A Python library for phylogenetic scripting, simulation, data processing and manipulation.
https://pypi.org/project/DendroPy/
BSD 3-Clause "New" or "Revised" License

Best way to add a long branch above root #117

Closed (psathyrella closed 5 years ago)

psathyrella commented 5 years ago

In order to calculate some tree metrics, I need to add a long "dummy" branch above the root node in my trees. There doesn't seem to be a way to add a parent node directly (I think on purpose?), so I've done this two ways. Both start by making a new tree with a single long branch below its root, then:

  1. attach the old tree's root node below this
  2. loop over every node in the old tree in preorder, attaching a new, corresponding single node to the new tree in the appropriate place for each one

My problem is that method 1 seems to make an appropriate tree but crashes while updating bipartitions, which makes me think that something is badly messed up internally. Method 2 works fine, except that it's ugly and very slow (I'm doing this on some very large trees).

Is there a better way to do this that's more in keeping with dendropy's design?

jeetsukumaran commented 5 years ago

Not sure I understand fully. But the root node (tree.seed_node) already has an Edge object:

import dendropy

t = dendropy.Tree()
print(t.seed_node.edge)  # the edge subtending the root node

produces:

<dendropy.datamodel.treemodel.Edge object at 0x7f40103cf9b0>

If you want another branch (and node) above this, then maybe the easiest would be to build the tree you want, then create a new, empty tree and attach the original tree's root node as a direct child of the new tree's root node (note that the new tree needs to share the original tree's taxon namespace):

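t1 = ...  # build or load the original tree here
# new, empty tree sharing t1's taxon namespace
t2 = dendropy.Tree(taxon_namespace=t1.taxon_namespace)
# the old root becomes the only child of the new root
t2.seed_node.add_child(t1.seed_node)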

As for bipartition calculation: by default this eliminates unifurcations (i.e., nodes with outdegree 1). That is, of course, exactly what you have in t2 (or in the method 1 you describe): the new root has a single child. You can avoid this by passing in suppress_unifurcations=False (e.g., to tree.encode_bipartitions()). This will create a collision in the splits-to-edge map, however, because the edge above the new root and the edge above the old root subtend the same set of leaves. The collision is resolved arbitrarily; this may or may not be an issue for your use case.
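To see why a unifurcation collides in a splits-to-edge map, here is a minimal sketch in plain Python. It is a toy model, not DendroPy's implementation: nodes are dicts, a "split" is simply the set of leaf names below a node's edge, and the helper names (make_node, leaf_set, splits_to_edges) are made up for illustration.

```python
# Toy model (NOT DendroPy's internals): a node is a dict with a name and
# a list of children; the "split" of a node's edge is the set of leaf
# names at or below that node.

def make_node(name, children=()):
    """Hypothetical helper: build a toy node."""
    return {"name": name, "children": list(children)}

def leaf_set(node):
    """Return the frozenset of leaf names at or below `node`."""
    if not node["children"]:
        return frozenset([node["name"]])
    result = frozenset()
    for child in node["children"]:
        result |= leaf_set(child)
    return result

def splits_to_edges(root):
    """Map each split to the names of ALL nodes whose edge induces it."""
    mapping = {}
    stack = [root]
    while stack:
        node = stack.pop()
        mapping.setdefault(leaf_set(node), []).append(node["name"])
        stack.extend(node["children"])
    return mapping

# Original tree ((A,B),C), then a new root placed above the old root.
old_root = make_node("old_root", [
    make_node("ab", [make_node("A"), make_node("B")]),
    make_node("C"),
])
new_root = make_node("new_root", [old_root])  # outdegree 1: a unifurcation

mapping = splits_to_edges(new_root)
# Splits that are induced by more than one edge
collisions = {s: names for s, names in mapping.items() if len(names) > 1}
print(collisions)
```

In this toy tree the only collision is between the edges above new_root and old_root, since both sit above every leaf. A map that keeps a single edge per split has to pick one of the two, which is the arbitrary resolution mentioned above.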

psathyrella commented 5 years ago

Ah, that's perfect, thanks. Yes, that was my method 1, but my problem was probably just that I was missing suppress_unifurcations=False.