Open LinguList opened 3 years ago
Since the newick class stores nodes by means of ancestor and descendant relations, most operations should be possible by simply modifying these relations. We have gained some experience with rooted trees, so we should stick with rooted trees so far and find simple solutions for these operations. That means also, that we define a branch by a node, and say that the edge defined by a node is from the ancestor of the node to the node.
We should also add a "reroot" function, as this makes it easier to avoid complex internal rerooting during nearest neighber exchange.
I have a first implementation ready, will share later.
I learned one more thing about nearest-neighbor interchange now, when using rooted trees. It is not problematic at all, but we have three distinct scenarios:
root.descendants
and the descendants thereof. If this does NOT yield a quartet, the root is not qualified for the operation.I don't know how cost-intensive this routine is, and we may want to experiment with other ways (one can even use pure newick strick manipulation, I tested this before), but it would be what I understand as the rooted equivalent for nearest-neighbor interchange, and not that difficult to apply. As I have a first solution for re-rooting, I'd put a version of NNI along with this PR (probably next week?).
We need to swap at the root, even if it is not a quartet. If one descendant of the root is a leaf, then, we can swap the leaf with one of the descendants of the leaf's sibling. We only need to check if a branch is not external.
Commented code from here: https://github.com/pylogeny/bayes/blob/master/src/cybayes/commands/mcmc_gamma.pyx#L235
shuffle_keys = list(temp_edges_list.keys()) # postorder list of edges labeled as a tuple of parent and child
random.shuffle(shuffle_keys) # shuffle the list of branches. We can select a node randomly.
for a, b in shuffle_keys:
if b > config.N_TAXA:# and a != root_node: #select a branch if it not connected to a leaf. Only internal branch.
x, y = nodes_dict[a], nodes_dict[b] #get the descendants of the nodes
break
if x[0] == b: src = x[1] # select the sibling of the branch where NNI is performed.
else: src = x[0]
tgt = random.choice(y) #select the sibling's children
src_bl, tgt_bl = temp_edges_list[a, src], temp_edges_list[b, tgt] # update the branch lengths
del temp_edges_list[a,src], temp_edges_list[b, tgt] # delete older branches
temp_edges_list[a, tgt] = tgt_bl # update the branch lengths
temp_edges_list[b, src] = src_bl # update the branch lengths
How to update the branch lengths in pylotree
format is the question. You know better, for sure.
If we go for ultrametric trees, then, we cannot swap at the root if it is not a quartet. We need to think if we want to do ultrametric trees.
I already looked how to do it. You need to modify the ancestor and descendant relations, that's all.
Here's an example: my function for tree rerooting, for which I'll prepare the PR later (but am still thinking if we should make this a tree-internal function of the tree class, like: Tree.root_ad(node)
returning a new tree.
def get_i(nodeA, nodeB):
return [i for i in range(2) if nodeA.descendants[i].name ==
nodeB.name][0]
def root_at(otree, node_name):
"""
Root the tree at a specific node.
"""
# TODO: Edge3 does not work yet!
tree = Tree(otree.newick)
node = tree[node_name]
root = tree.root
ancestor = node.ancestor
if ancestor.name == root.name:
return otree
# split tree in half
partA = [d for d in root.descendants if [x for x in d.get_leaf_names() if x
in node.get_leaf_names()]][0]
partB = [d for d in root.descendants if not [x for x in d.get_leaf_names() if x
in node.get_leaf_names()]][0]
if partA.name != ancestor.name:
partA.ancestor = None
partB.ancestor = None
# assign new data the new root
root.descendants = [node, ancestor]
queue = [(ancestor, ancestor.ancestor)]
aa = ancestor.ancestor
ancestor.descendants[get_i(ancestor, node)] = aa #ancestor.ancestor
node.ancestor, ancestor.ancestor = root, root
while queue:
nn, aa = queue.pop(0)
if aa.ancestor:
aa.descendants[get_i(aa, nn)] = aa.ancestor
queue += [(aa, aa.ancestor)]
aa.ancestor = nn
else:
aa.descendants[get_i(aa, nn)] = partB
partB.ancestor = aa
aa.ancestor = nn
return tree
@PhyloStar will add specifics here, but we need something like:
These require that a node in the tree is placed somewhere else, so we need to be able to delete and insert parts of a tree (a node with its descendants).