matsengrp / ecgtheow

Ancestral lineage reconstruction using BEAST or RevBayes

We need blobs! #14

Open matsen opened 6 years ago

matsen commented 6 years ago

Since our last meeting, I've been pondering what to do with highly uncertain regions of the graph, such as this beast copied from @lauranoges' thoughts on #12:

[Image: screenshot 2017-11-05 at 5 44 46 am]

In our last conversation I realized that this could happen even in the absence of tree topology uncertainty: it can come just from having long branches and a given branch not knowing where to attach. In this case we can have substantial uncertainty high in the graph, even though it contracts back down to more certain nodes lower in the graph. This isn't a problem in principle, but in practice it makes our graphs impossible to look at and interpret.

Recall that each node in the graph represents exactly one sequence. We might get more interpretable results by generalizing so that a node can represent a collection of similar sequences. I propose that we call these "blobs", and use some other shape to represent their contracted forms in our graph.

How do we find an appropriate blobification? My sense is that we want to infer a blob wherever flow through the directed graph first separates and then comes back together under Markov flow. This kind of reminds me of MCL, but our case is simpler because we mostly have a DAG here. MCL itself doesn't work with directed graphs, so we can't use it as is.
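To make the "separates and then comes back together" idea concrete, here is a rough sketch of one possible approach. It is not MCL and it ignores the flow weights entirely: it only uses the shape of the DAG. For every node with more than one child, it looks for the nearest node that every downstream path must pass through (its immediate post-dominator) and calls everything in between a blob. The function name `find_blobs`, the `SINK` marker, the successor-dict input format, and the toy graph are all made up for illustration; nothing here comes from the ecgtheow code. It assumes the graph really is acyclic and needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter

SINK = "__sink__"  # hypothetical virtual sink; assumed not to clash with real node names


def find_blobs(succ):
    """Map (entry, exit) -> interior nodes for each diverge/reconverge region.

    `succ` maps each node to a list of its children in the DAG.
    """
    nodes = set(succ) | {t for ts in succ.values() for t in ts}
    # Send every leaf to the virtual sink so that every node has a post-dominator.
    g = {n: list(succ.get(n, [])) or [SINK] for n in nodes}
    g[SINK] = []

    # pdom[n] = nodes that lie on every path from n to the sink (including n).
    # Passing children as "predecessors" makes static_order() yield children first.
    pdom = {}
    for n in TopologicalSorter(g).static_order():
        common = set.intersection(*(pdom[s] for s in g[n])) if g[n] else set()
        pdom[n] = {n} | common

    blobs = {}
    for entry in nodes:
        if len(set(g[entry])) < 2:
            continue  # no branching here, so nothing separates
        # The immediate post-dominator is the one with exactly one fewer post-dominator.
        exit_ = next(d for d in pdom[entry] if len(pdom[d]) == len(pdom[entry]) - 1)
        if exit_ == SINK:
            continue  # the branches never come back together
        # Collect everything reachable from entry without passing through exit_.
        interior, stack = set(), list(g[entry])
        while stack:
            n = stack.pop()
            if n != exit_ and n not in interior:
                interior.add(n)
                stack.extend(g[n])
        blobs[(entry, exit_)] = interior
    return blobs


# Toy graph: root splits into a and b, which rejoin at c; d splits and never rejoins.
example = {"root": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": ["e", "f"]}
blobs = find_blobs(example)
```

On the toy graph this reports a single blob between `root` and `c` containing `a` and `b`, and skips the split at `d` because its branches never reconverge. Nested splits would each be reported separately, so a real version would probably need to merge them and to take the flow probabilities into account when deciding which regions are uncertain enough to contract.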

We can think about that more, but assuming we can do that, does this seem like a good idea?