matsengrp / larch

Inference and manipulation of history DAGs

more direct integration with matOptimize #5

Closed matsen closed 1 year ago

matsen commented 2 years ago

Right now the idea is that we will run some external program to give us trees, then we will load them in, and then build a DAG.

We'd like to be able to work more closely with matOptimize to do this in the context of a single program.

matsen commented 2 years ago

@ognian- @willdumm -- @yatisht just mentioned that usher-optimize-dev will soon be the main version of matOptimize. It supplies a program called usher-sampled. Let's focus our attention on that moving forward (Will, sorry for only remembering this now).

ognian- commented 2 years ago

Automatic building of Usher as a dependency has been added to the branch 5-integration-with-matoptimize.

willdumm commented 2 years ago

matOptimize proposes and applies SPR moves of various radii. This ugly drawing shows how an example SPR move with radius 2 changes the hDAG. The top row shows an original tree and its corresponding history DAG. The bottom row shows the tree after the SPR (which moved clade B next to clade C), and the new corresponding history DAG. The move adds three new nodes to the DAG (unless they already exist) and, in this case, seven new edges. Parts of the DAG highlighted in yellow do not change after the SPR.

The mutations needed on these new edges can be read from matOptimize, but in order to properly merge the new nodes into the existing MAD, their compact genomes and child clade sets must be computed first. Both can be computed quickly relative to adjacent existing nodes in the MAD.
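To make "computed relative to adjacent existing nodes" concrete, here is a rough Python sketch. It is not larch or matOptimize code: the names `apply_mutations` and `child_clade_sets` are made up, and it assumes a compact genome is a map from 1-based position to the base that differs from the reference, and that each edge mutation is a `(parent_base, child_base)` pair.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Assumed representation (hypothetical, for illustration only):
# position (1-based) -> base, listing only sites that differ from the reference.
CompactGenome = Dict[int, str]
# Edge mutations: position -> (parent_base, child_base).
EdgeMutations = Dict[int, Tuple[str, str]]


def apply_mutations(parent_cg: CompactGenome,
                    mutations: EdgeMutations,
                    reference: str) -> CompactGenome:
    """Derive a child's compact genome from its parent's and the edge mutations."""
    child = dict(parent_cg)
    for pos, (old, new) in mutations.items():
        ref_base = reference[pos - 1]
        current = child.get(pos, ref_base)
        if current != old:
            raise ValueError(f"mutation at {pos} expects {old}, parent has {current}")
        if new == ref_base:
            # Back-mutation to the reference: the site no longer needs an entry.
            child.pop(pos, None)
        else:
            child[pos] = new
    return child


def child_clade_sets(children_leaves: Iterable[Iterable[str]]) -> FrozenSet[FrozenSet[str]]:
    """Child clade sets of a node: one set of leaf labels per child."""
    return frozenset(frozenset(leaves) for leaves in children_leaves)


reference = "ACGT"
parent_cg = {2: "T"}                          # parent differs from reference at site 2
edge = {2: ("T", "C"), 4: ("T", "A")}         # site 2 reverts, site 4 mutates
new_cg = apply_mutations(parent_cg, edge, reference)
new_clades = child_clade_sets([{"B"}, {"C", "D"}])
```

Only the sites touched by the edge are visited, so the cost depends on the number of mutations on the new edge rather than on genome length.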

It seems like feedback about whether a proposed SPR's nodes are already in the hDAG could be used to inform matOptimize's choices in the future.
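A possible shape for that feedback, sketched in Python under the assumption that a node is identified by its compact genome together with its child clade sets. `node_key` and `novel_nodes` are hypothetical names, not part of larch or matOptimize.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Set, Tuple

CompactGenome = Dict[int, str]
CladeSets = FrozenSet[FrozenSet[str]]
ProposedNode = Tuple[CompactGenome, CladeSets]


def node_key(cg: CompactGenome, clades: CladeSets) -> Hashable:
    """Hashable identity for a node (assumed: compact genome + child clade sets)."""
    return (tuple(sorted(cg.items())), clades)


def novel_nodes(existing: Set[Hashable],
                proposed: Iterable[ProposedNode]) -> List[ProposedNode]:
    """Return the proposed nodes whose keys are not already in the DAG."""
    return [(cg, clades) for cg, clades in proposed
            if node_key(cg, clades) not in existing]


clades_bc = frozenset({frozenset({"B"}), frozenset({"C"})})
clades_abc = frozenset({frozenset({"A"}), frozenset({"B", "C"})})

# Keys of nodes already present in the DAG.
existing = {node_key({4: "A"}, clades_bc)}

# Nodes a proposed SPR would create: the first already exists, the second does not.
proposed = [({4: "A"}, clades_bc), ({}, clades_abc)]
fresh = novel_nodes(existing, proposed)
```

The count `len(fresh)` is the kind of signal that could be reported back: a move whose nodes all already exist adds nothing new to the DAG beyond possibly new edges.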

image

willdumm commented 1 year ago

We've left this issue in the dust!