amnh / PCG

𝙋𝙝𝙮𝙡𝙤𝙜𝙚𝙣𝙚𝙩𝙞𝙘 𝘾𝙤𝙢𝙥𝙤𝙣𝙚𝙣𝙩 𝙂𝙧𝙖𝙥𝙝 ⸺ Haskell program and libraries for general phylogenetic graph search

Reimplement character sequences into a character matrix. #85

Open recursion-ninja opened 5 years ago

recursion-ninja commented 5 years ago

Currently, character sequences are stored on the nodes of the phylogenetic forest. This makes column-wise analysis unwieldy. We should instead store the character sequences as rows in a "character matrix" that is separate from the graph topology. This will make column-wise operations (required for network analysis and metadata summaries) more efficient and less unwieldy. It will also make extracting single characters from the phylogenetic forest much easier.

The representation will likely be Vector (CharacterSequence u v w x y z), with the vector indices corresponding to the indices of the reference vector of the topological representation.
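A minimal sketch of that layout, assuming the `vector` package. `CharSeq` is a hypothetical single-parameter stand-in for the real `CharacterSequence u v w x y z`, and `CharacterMatrix`, `fromRows`, `nodeSequence`, and `characterColumn` are illustrative names, not part of PCG:

```haskell
import           Data.Vector (Vector)
import qualified Data.Vector as V

-- Hypothetical stand-in for CharacterSequence u v w x y z:
-- a sequence is simply a vector of characters of one type.
newtype CharSeq c = CharSeq (Vector c)
  deriving (Eq, Show)

-- One row per node. Row i is assumed to belong to the node stored at
-- index i of the topology's reference vector.
newtype CharacterMatrix c = CharacterMatrix (Vector (CharSeq c))
  deriving (Eq, Show)

-- Build a matrix from a list of rows.
fromRows :: [[c]] -> CharacterMatrix c
fromRows = CharacterMatrix . V.fromList . fmap (CharSeq . V.fromList)

-- Row-wise access: the whole sequence of node i, if that node exists.
nodeSequence :: Int -> CharacterMatrix c -> Maybe (CharSeq c)
nodeSequence i (CharacterMatrix rows) = rows V.!? i

-- Column-wise access: character j of every node.
-- Returns Nothing if any row has no position j.
characterColumn :: Int -> CharacterMatrix c -> Maybe (Vector c)
characterColumn j (CharacterMatrix rows) =
  traverse (\(CharSeq s) -> s V.!? j) rows
```

With this shape, extracting a single character across the forest is one traversal of the row vector and does not need to walk the graph.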

recursion-ninja commented 5 years ago

We should explore generalizing the postorder and preorder passes into a continuation-passing-style memoization so that the logic of multiple passes can be combined into a single "fold."
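One possible reading of this idea, sketched on a plain rose tree from `containers` rather than on PCG's graph type: a single `foldTree` computes each node's postorder summary and, alongside it, a function that still waits for the context inherited from the parent. Applying that function at the root performs the preorder decoration. The name `twoPass` and its parameters are hypothetical, and this is only an illustration of the technique, not the design the comment necessarily has in mind.

```haskell
import Data.Tree (Tree (..), foldTree)

-- Combine a postorder and a preorder pass in one fold.
--   post: node label and the children's summaries give this node's summary.
--   pre:  the parent's context and the child's own summary give the child's context.
--   rootCtx: the context handed to the root.
twoPass
  :: (a -> [s] -> s)
  -> (i -> s -> i)
  -> i
  -> Tree a
  -> Tree (a, s, i)
twoPass post pre rootCtx t = snd (foldTree alg t) rootCtx
  where
    -- Each node yields its summary plus a function awaiting its context.
    alg x kids =
      let s         = post x (map fst kids)
          build ctx = Node (x, s, ctx)
                           [ k (pre ctx sChild) | (sChild, k) <- kids ]
      in  (s, build)
```

For example, `twoPass (\x ss -> x + sum ss) (\d _ -> d + 1) 0` labels every node with its subtree sum and its depth. Each summary is computed once during the fold and then shared by the closure that builds the decorated node.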

Boarders commented 5 years ago

When we do this, we should consider whether it is possible to use unboxed vectors, either for the character matrix itself or for the CharacterSequences (I think the latter is more likely to make sense).
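A rough sketch of the second option, assuming the `vector` package and a hypothetical encoding in which each character is a `Word8` bitmask of permitted states. The outer matrix stays boxed because `Data.Vector.Unboxed` provides no `Unbox` instance for vectors themselves, so only the rows can be unboxed here. `UnboxedSequence`, `Matrix`, and `columnIntersection` are illustrative names, not PCG definitions:

```haskell
import           Data.Bits           ((.&.))
import qualified Data.Vector         as V
import qualified Data.Vector.Unboxed as U
import           Data.Word           (Word8)

-- Hypothetical encoding: one Word8 bitmask of permitted states per character.
type UnboxedSequence = U.Vector Word8

-- Boxed vector of unboxed rows.
type Matrix = V.Vector UnboxedSequence

-- Column-wise operation: intersect the state sets at position j across
-- all rows. Returns Nothing if any row has no position j.
columnIntersection :: Int -> Matrix -> Maybe Word8
columnIntersection j rows =
  V.foldl' (.&.) 0xFF <$> traverse (U.!? j) rows
```

Whether the real `CharacterSequence` can be unboxed depends on whether its element types admit `Unbox` instances, which this sketch simply assumes.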