amnh / PCG

š™‹š™š™®š™”š™¤š™œš™šš™£š™šš™©š™žš™˜ š˜¾š™¤š™¢š™„š™¤š™£š™šš™£š™© š™‚š™§š™–š™„š™ āøŗ Haskell program and libraries for general phylogenetic graph search
28 stars 1 forks source link

Publish improved implied aligment algorithm on binary trees #31

Closed recursion-ninja closed 4 years ago

recursion-ninja commented 7 years ago

We should define the straightforward way to derive the implied alignment(s) on a phylogenetic network. This should take into account the optimal "rooting" of the dynamic character in the phylogenetic network. Lastly should perform a comparison in terms of time and scoring between alignment generated from an implied alignment in a phylogenetic network and those alignments generated from more classical sequence alignment software.

During the summer of we discovered a way to remove a factor of |E| from the asymptotic time complexity of the implied alignment algorithm by exploiting the monoid (possibly only magma) structure of dynamic characters. This allows the implied alignments for each dynamic character in a binary tree to be computed after a post-order then pre-order pass. The best previous know method involved an "incremental" pre-order pass with "back propagation," modifying previous results. Since this process was "iterative," at edge there is the possibility of O(|E|) work causing the algorithm to run in O(k*|E|^2) time where k is the cost of aligning two dynamic characters. Our new method of deriving the implied alignment for each dynamic character in a binary tree runs in O(k*|E|) time.

We can look to Ford & Wheeler's paper for structure and a starting point for the comparisons.

recursion-ninja commented 6 years ago

I've been giving this a lot of thought and I think we can perform the direct optimization preliminary assignments in a O(n * k^2) post order traversal and then both the direct optimization final assignments & the implied alignment in a single O(n * k) pre-order traversal. It requires storing a bit of extra alignment context information on the post-order, but we let that information leave scope currently.

recursion-ninja commented 4 years ago

This has been published!

DOI:10.1186/s12859-020-03595-2

See the complete algorithm and paper here.