ivan-krukov / aligning-genealogies

The genealogy-coalescent alignment project

Incorporate path completion to greedy Aligners #26

Open shz9 opened 4 years ago

shz9 commented 4 years ago

We need a set of heuristics for the greedy algorithms that reconstruct the full traversal trajectory of a given haplotype. The greedy algorithms we have implemented so far leave many nodes unmatched. These nodes are often intermediate ones that are subsumed into the edges connecting the tree sequence nodes.

I have already implemented a simple heuristic that works in a limited number of cases (e.g. skipped parents). The code is in the .complete_paths() method of the MatchingAligner class.

Heuristic (1): If a pedigree node is sandwiched between two matched nodes, and there is an edge between the tree sequence nodes those two are matched to, then assign the pedigree node to that edge.
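
To make the discussion concrete, here is a minimal, standalone sketch of Heuristic (1). It is not the code in .complete_paths(); the function name, the argument names, and the data layout (plain dicts and a set of (parent, child) tuples) are all assumptions made for illustration. "Sandwiched" is read here as: the unmatched pedigree node has a matched pedigree parent and a matched pedigree child.

```python
def complete_paths_sketch(ped_parents, ped_to_ts, ts_edges):
    """Hypothetical illustration of Heuristic (1), not the project's API.

    ped_parents: dict mapping pedigree child -> iterable of pedigree parents
    ped_to_ts:   dict mapping matched pedigree node -> tree sequence node
    ts_edges:    set of (ts_parent, ts_child) tuples
    Returns a dict mapping unmatched pedigree node -> tree sequence edge.
    """
    # Invert child -> parents into parent -> children.
    ped_children = {}
    for child, parents in ped_parents.items():
        for parent in parents:
            ped_children.setdefault(parent, []).append(child)

    def find_edge(node):
        # Look for a matched parent and a matched child whose tree
        # sequence nodes are directly connected by an edge.
        for parent in ped_parents.get(node, ()):
            for child in ped_children.get(node, ()):
                if parent in ped_to_ts and child in ped_to_ts:
                    edge = (ped_to_ts[parent], ped_to_ts[child])
                    if edge in ts_edges:
                        return edge  # assumption: first hit wins
        return None

    ped_node_to_ts_edge = {}
    for node in ped_parents.keys() | ped_children.keys():
        if node in ped_to_ts:
            continue  # already matched to a tree sequence node
        edge = find_edge(node)
        if edge is not None:
            ped_node_to_ts_edge[node] = edge
    return ped_node_to_ts_edge


# Skipped-parent example: individual 1 and grandparent 5 are matched,
# parent 3 is not, and the tree sequence has an edge 20 -> 10.
ped_parents = {1: (3, 4), 3: (5, 6)}
ped_to_ts = {1: 10, 5: 20}
ts_edges = {(20, 10)}
result = complete_paths_sketch(ped_parents, ped_to_ts, ts_edges)
print(result)  # {3: (20, 10)}
```

In this sketch, nodes 4 and 6 stay unassigned because they have no matched parent, which is the kind of case additional heuristics would need to cover. When several (parent, child) pairs qualify, the sketch simply takes the first one it finds; how to break such ties is an open design choice.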

I also added the ped_node_to_ts_edge mappings to our evaluation metrics.

Feel free to contribute ideas or implementations.