pylogeny / tree

Basic functionalities for handling phylogenetic trees, expanding the newick package
Apache License 2.0
0 stars 0 forks source link

refactored package #2

Closed xrotwang closed 3 years ago

xrotwang commented 3 years ago

Since this is intended to be a micro package (?), I just turned into into a single module, saving some of the overhead that comes with packages (i.e. __init__.py).

I also refactored the Tree class to not store the node_dict. The advantage of this is that tree manipulations like renaming nodes or pruning are possible and __getitem__ will still work as expected. This comes at the disadvantage of having to traverse the tree for each __getitem__ call. So my proposal follows the "pre-mature optimization is the root of all evil" mantra. Do we know of any performance characteristics this code should have right now?

closes #1

LinguList commented 3 years ago

@PhyloStar, what do you think about performance? Do we need to access nodes often?

LinguList commented 3 years ago

@PhyloStar, what do you think about performance? Do we need to access nodes often?

LinguList commented 3 years ago

For me, this is very useful, and I'll probably then put my parsimony code also into a single file module. I think we can directly merge it, but I'd like @PhyloStar to also have a quick look, as you'll probably later want to add additional tree handling functions.

xrotwang commented 3 years ago

From what I've seen so far, parsimony_down could be a case, where nodes are accessed quite a bit. But then, a node dict could just be created and used within this function. This approach would break down, if we run parsimony_down multiple time on the same tree - but is this a common use case?

LinguList commented 3 years ago

parsimony_down is usually a lot applied to the same tree, if you have many characters, but since we only need the traversal, so we only need the post- and the pre-order, we could even do simply with those lists. So I think this should not be a problem.

LinguList commented 3 years ago

so I'd say we merge this now and see later, if problems turn up with performance, how to optimize them!

PhyloStar commented 3 years ago

I am still digesting the Tree class code. I have few questions, comments, features:

LinguList commented 3 years ago

Branch lengths are suppored in python-newick, as far as I understand.

PhyloStar commented 3 years ago

parsimony_down is usually a lot applied to the same tree, if you have many characters, but since we only need the traversal, so we only need the post- and the pre-order, we could even do simply with those lists. So I think this should not be a problem.

Yes, we can store in a list the order of the nodes in the tree. Then, apply parsimony_down or likelihood function. Whenever we manipulate the tree structure, then we update the postorder list of the nodes so that the next likelihood function computation can be done. The likelihood function calculate is here: https://github.com/pylogeny/bayes/blob/master/src/cybayes/commands/ML_scaled_cache.pyx#L68

xrotwang commented 3 years ago

@PhyloStar thanks for the update on what's needed! That puts things into perspective - because right now, the Tree class isn't much more than a newick.Node with labelled internal nodes plus

{n.name: n for n in tree.walk()}

So, probably not worth a separate package.

xrotwang commented 3 years ago

@PhyloStar the link to the adjacency list code seems to be the same as for the branch lengths parsing. Could you point me to the "real thing"?

PhyloStar commented 3 years ago

@PhyloStar the link to the adjacency list code seems to be the same as for the branch lengths parsing. Could you point me to the "real thing"?

We need to compute a transition probability matrix for each edge by calling this function here which computes F81 model prob. matrix for each branch. https://github.com/pylogeny/bayes/blob/master/src/cybayes/commands/mcmc_gamma.pyx#L878

If we want to modify an edge this function is called: https://github.com/pylogeny/bayes/blob/master/src/cybayes/commands/mcmc_gamma.pyx#L109

LinguList commented 3 years ago

Okay, I'll merge for now and later we can discuss updates in issues!