pylogeny / tree

Basic functionalities for handling phylogenetic trees, expanding the newick package
Apache License 2.0
0 stars 0 forks source link

Reimplement add_edges as node visitor #1

Closed xrotwang closed 3 years ago

xrotwang commented 3 years ago

Rather than parsing the newick string here https://github.com/pylogeny/tree/blob/d5e2bca1c272cbbe36571401e7cdd20f7fcf2efb/src/pylotree/tree.py#L5 add_edges could be implemented as node visitor for newick.Node.visit:

>>> import newick
>>> t = newick.loads('((a,b),c);')[0]
>>> class EdgeLabels:
...     def __init__(self):
...         self.count = 0
...     def __call__(self, node):
...         if not node.name:
...             if not node.ancestor:
...                 node.name = 'Root'
...             else:
...                 self.count += 1
...                 node.name = 'Edge{}'.format(self.count)
... 
>>> t.visit(EdgeLabels())
>>> t.newick
'((a,b)Edge1,c)Root'
LinguList commented 3 years ago

Very cool, we should definitely use that!

xrotwang commented 3 years ago

I can have a look at the package later, and add this.

LinguList commented 3 years ago

Thanks, that would be very cool. The basic functionality is not much divergent from python-newick, but adds this dict-access, which is useful for parsimony, and we may also use the code for tree differences, which @PhyloStar has in other projects, plus add functionalities for tree traversals later.

xrotwang commented 3 years ago

Any reason why the repos is called tree and not pylotree like the package?

LinguList commented 3 years ago

Aesthetic reasons only. I also have the plan to use the tree namespace when wrapping these tiny packages into pylogeny/pylogeny. So one can import pylogeny.tree, etc. But maybe it is not that good an idea, I am not sure.

xrotwang commented 3 years ago

Ah, ok. So the plan is a namespace package pylogeny with a subpackage tree? Should I move the tree package there already?

LinguList commented 3 years ago

I was not sure how to do this best. My idea was to have these sub-packages as standalones, so one could use the Tree package, for example, in LingPy, where it might be needed for reference tree computation, as well as the cluster package, but where the bayes package would be too much. So each package should be standalone, on pypi, but for convenience, pylogeny/pylogeny will use these as dependencies and add access for reading and writing, which the individual packages do not provide.

LinguList commented 3 years ago

So pylogeny/tree would have the standalone namespace pylotree, so it can be posted on pypi and re-used in other projects.

xrotwang commented 3 years ago

If pylotree is a separate package, I don't think it should be repackaged in pylogeny, but rather required and imported, no?

LinguList commented 3 years ago

Yes, that is what I meant, sorry for confusion, you can see a draft here.

LinguList commented 3 years ago

So essentially, pylogeny/pylogeny depends on the packages pylogeny/tree aka pylotree, pylogeny/cluster aka pylocluster, etc., the shortening of the suffix was merely for aesthetic reasons on github.

xrotwang commented 3 years ago

ok. But then I'd rename this repos to pylotree. I find it confusing when repos and package have different names - as is the case with python-newick :)

LinguList commented 3 years ago

Yes, I see the point. This would mean, we should then rename all repos (pylocluster, etc.). But that's fine. @PhyloStar, so our new convention is then: pylo is always the prefix, and concordance would be pyloconcordance, also on github.

xrotwang commented 3 years ago

Edge is a bit of a misnomer here, though, right? It's rather internal nodes that are labelled. Or is it common to use the node name in a tree as label for the incoming edge?

Johann-Mattis List @.***> schrieb am Di., 4. Mai 2021, 11:48:

Yes, I see the point. This would mean, we should then rename all repos (pylocluster, etc.). But that's fine. @PhyloStar https://github.com/PhyloStar, so our new convention is then: pylo is always the prefix, and concordance would be pyloconcordance, also on github.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pylogeny/tree/issues/1#issuecomment-831817440, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGUOKBHI53YP4EFIBXCCJDTL67FVANCNFSM44CKQ4RA .

LinguList commented 3 years ago

Yes, this is from pycogent, they call it Edge. So calling it Node is also fine. But the phylogeny folk uses Edge for node, you can see that from the code by @PhyloStar ;) But I also prefer "Node".

xrotwang commented 3 years ago

Ok, I'll send a PR

PhyloStar commented 3 years ago

Yes. Node/Edge confusion. :(

PhyloStar commented 3 years ago

Yes, this is from pycogent, they call it Edge. So calling it Node is also fine. But the phylogeny folk uses Edge for node, you can see that from the code by @PhyloStar ;) But I also prefer "Node".

I found it confusing when working with Tree package. I prefer Node for internal nodes.

@LinguList Is the code in bayes package? I thought I am using numbers for internal nodes.

LinguList commented 3 years ago

No, I don't think it is in there.