hyanwong / treeseq-node-identifiability

Examples and issues to assess identifiability of nodes in an arg / tree sequence
MIT License

Discuss requirement for a "fully simplified" tree sequence #1

Open hyanwong opened 1 year ago

hyanwong commented 1 year ago

We argued that it seems necessary for every node in the genealogy to be coalescent everywhere it appears. The tskit format's ability to collapse recombination nodes down into their nearest descendant coalescent node is useful here.

It also seems necessary that we always ascend from a coalescent node, which implies that we shouldn't assess a shared breakpoint if it is associated with a portion of a node that is unary. Hence "full" simplification (rather than `keep_unary_when_coalescent`) should make our algorithm much easier to think about.
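To make "a portion of a node that is unary" concrete, here is a rough pure-Python sketch. It does not use the tskit API: the `(left, right, parent, child)` edge tuples and the helper name `unary_spans` are assumptions made up for illustration. It reports, for each parent node, the genomic intervals over which that node has exactly one child. For a fully simplified tree sequence the result should be empty.

```python
from collections import defaultdict


def unary_spans(edges):
    """Map each parent node to the intervals where it has exactly one child.

    `edges` is a hypothetical list of (left, right, parent, child) tuples
    with half-open intervals [left, right); this is not a tskit structure.
    """
    by_parent = defaultdict(list)
    for left, right, parent, child in edges:
        by_parent[parent].append((left, right))

    result = {}
    for parent, intervals in by_parent.items():
        # Every edge endpoint is a position where the child count may change.
        points = sorted({p for interval in intervals for p in interval})
        spans = []
        for lo, hi in zip(points, points[1:]):
            n_children = sum(1 for l, r in intervals if l <= lo and hi <= r)
            if n_children == 1:
                if spans and spans[-1][1] == lo:
                    # Extend the previous span when the two are adjacent.
                    spans[-1] = (spans[-1][0], hi)
                else:
                    spans.append((lo, hi))
        if spans:
            result[parent] = spans
    return result


# Toy example: node 3 is coalescent over [0, 5) but unary over [5, 10).
edges = [
    (0, 10, 3, 0),
    (0, 5, 3, 1),
    (0, 10, 4, 3),
    (0, 10, 4, 2),
    (5, 10, 4, 1),
]
print(unary_spans(edges))
```

In this toy example node 3 is the kind of partially coalescent node that a keep-unary-when-coalescent style of simplification would retain over [5, 10). Under full simplification, sample 0 would instead attach directly to node 4 over that interval, so any breakpoint we assess would always be reached by ascending from a coalescent node.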