hyanwong / giglib

MIT License

Extracting haplotypes / DNA sequences #113

Open hyanwong opened 7 months ago

hyanwong commented 7 months ago

Once we figure out how to add mutations (#5), we need to be able to extract haplotypes. We can probably do this on a per-genome basis simply by traversing up each edge and stopping once each interval has either collected mutations at every position or hit the root (whichever comes first).
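As a rough illustration of the per-genome traversal, here is a minimal sketch. It does not use giglib's actual API: the `Edge` class, the `mutations` dict keyed by `(node, position)`, and the `root_seqs` dict of ancestral sequences are all hypothetical stand-ins, and inversions are ignored. For simplicity it walks upwards one position at a time rather than one interval at a time, which gives the same result but is slower.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical interval edge: child[child_left:child_right) is
    inherited from the parent, starting at parent_left (no inversions)."""
    child: int
    child_left: int
    child_right: int
    parent: int
    parent_left: int


def haplotype(node, length, edges, mutations, root_seqs):
    """Return the sequence of `node` by walking up the graph.

    mutations: {(node, pos): derived_state}, in that node's own coordinates
    root_seqs: {root_node: ancestral sequence}  (an assumption of this sketch)
    """
    by_child = {}
    for e in edges:
        by_child.setdefault(e.child, []).append(e)

    out = []
    for start in range(length):
        u, pos = node, start
        while True:
            # The first mutation met on the way up is the most recent one.
            if (u, pos) in mutations:
                out.append(mutations[(u, pos)])
                break
            edge = next(
                (e for e in by_child.get(u, [])
                 if e.child_left <= pos < e.child_right),
                None,
            )
            if edge is None:
                # No parent covers this position: we have hit a root.
                out.append(root_seqs[u][pos])
                break
            # Translate the position into the parent's coordinate system.
            u, pos = edge.parent, edge.parent_left + (pos - edge.child_left)
    return "".join(out)


# Toy example: node 1 swaps the two halves of root 0; node 2 copies node 1.
edges = [
    Edge(child=1, child_left=0, child_right=2, parent=0, parent_left=2),
    Edge(child=1, child_left=2, child_right=4, parent=0, parent_left=0),
    Edge(child=2, child_left=0, child_right=4, parent=1, parent_left=0),
]
mutations = {(1, 1): "A", (2, 3): "T"}
root_seqs = {0: "ACGT"}
print(haplotype(2, 4, edges, mutations, root_seqs))
```

An interval-based version would carry `(node, left, right)` triples up the graph and split them wherever a mutation or an edge boundary falls, which is closer to the approach described above.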

Working out how to do this for a collection of haplotypes is more complicated, but we can probably go left-to-right on one of the genomes and fill in some of the mutations / allele identities in the others. It might be worth thinking about how this maps to small indels first, rather than e.g. large duplications or inversions, which are likely to cause more complications.