matsengrp / historydag

https://matsengrp.github.io/historydag
GNU General Public License v3.0
0 stars 1 forks source link

Indexing histories #3

Closed willdumm closed 1 year ago

willdumm commented 2 years ago

Right now, the only sort of index we have for histories in the DAG are Newick strings. Since the DAG can be trimmed to express only histories with a given Newick string, this works as an index. However, finding Newick strings of all histories in the DAG is slow.

We should implement a numeric index based on node-clade pair target indices.

willdumm commented 2 years ago

Here's a little more on the idea that I had in mind for this.

It's a little silly, but if you're given a binary tree with N leaves, you can find any leaf by its index (an integer between 1 and N) by doing the algorithm outlined in the attached pdf.

This can be generalized to multifurcating trees (in which a node may have any number of children, not just two) without much modification.

Can we use this idea as inspiration for finding a tree in a history DAG given an integer index? The algorithm may be much more complicated, but I think it can be done.

History DAG 4.pdf

willdumm commented 2 years ago

Here's that pdf as an image, so it shows up on the comment hdag

willdumm commented 2 years ago

The next step towards being able to do something similar in the history DAG is, perhaps, coming up with an analogous way to index sub-trees of this structure which have exactly one black and one red edge descending from each node: image