Sankoff with multiple criteria

It might be useful to generalize the Sankoff algorithm to admit both sequence and geographic data.

This can be done naively for the 2-region case by appending a binary-valued character, representing the geographic region of interest, to the existing sequences. However, this has two limitations that we would like to get around:

We would like to allow for more than 2 geographic regions.
We might want to weight transitions between geographic regions differently from transitions between the bases of the actual sequences.

One way to get around this would be to create separate node attributes, each with its own cost matrix, as @willdumm suggests in this comment.
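To make the idea concrete, here is a minimal sketch of running Sankoff once per attribute, each with its own cost matrix, and combining the results with weights. This is not historydag code: the tree, the leaf labels, the region names, the cost values and the weights `W_SEQ` and `W_GEO` are all made up for illustration, and a single base stands in for a full sequence.

```python
import math

def sankoff(tree, root, leaf_state, states, cost):
    """Return {state: minimum subtree cost} at `root` for one attribute."""
    def visit(node):
        children = tree.get(node, [])
        if not children:
            # A leaf is fixed to its observed state.
            return {s: (0.0 if s == leaf_state[node] else math.inf) for s in states}
        child_vecs = [visit(c) for c in children]
        # For each parent state, every child picks its cheapest state.
        return {
            s: sum(min(cost[s][t] + vec[t] for t in states) for vec in child_vecs)
            for s in states
        }
    return visit(root)

# Hypothetical toy tree: root -> (n1, z), n1 -> (x, y).
tree = {"root": ["n1", "z"], "n1": ["x", "y"]}

# Attribute 1: one base per node, unit cost for any substitution.
bases = "ACGT"
base_cost = {a: {b: 0 if a == b else 1 for b in bases} for a in bases}
leaf_base = {"x": "A", "y": "A", "z": "G"}

# Attribute 2: three regions with a non-uniform (assumed) cost matrix.
regions = ["east", "west", "north"]
region_cost = {
    "east": {"east": 0, "west": 1, "north": 3},
    "west": {"east": 1, "west": 0, "north": 1},
    "north": {"east": 3, "west": 1, "north": 0},
}
leaf_region = {"x": "east", "y": "north", "z": "north"}

base_vec = sankoff(tree, "root", leaf_base, bases, base_cost)
region_vec = sankoff(tree, "root", leaf_region, regions, region_cost)

# Assumed weights for combining the two criteria.
W_SEQ, W_GEO = 1.0, 2.0
total = W_SEQ * min(base_vec.values()) + W_GEO * min(region_vec.values())
print(total)
```

Keeping the attributes separate allows any number of regions and an independent cost matrix per attribute, which covers both limitations listed above.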

In this case, we would need to couple the attributes carefully. A potential problem when optimizing cost over multiple criteria is that the overall cost may reach its optimum at a combination of attribute values that is not a local optimum for any of the attributes taken independently. That is, the overall cost of the choice at a given node should be realized by the entire set of attributes at once on each tree below it.
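One way to keep the attributes coupled, sketched here under assumptions, is to run Sankoff over joint states, so that each entry of a node's cost vector is keyed by a full (base, region) pair and a choice at a node is always a choice for every attribute at once. The tree, the labels, the cost table and the weights below are hypothetical, and `joint_sankoff` is an illustrative helper, not part of historydag.

```python
import math
from itertools import product

def joint_sankoff(tree, root, leaf_label, states, cost):
    """Return {joint state: minimum subtree cost} at `root`."""
    def visit(node):
        children = tree.get(node, [])
        if not children:
            # A leaf is fixed to its observed joint label.
            return {s: (0.0 if s == leaf_label[node] else math.inf) for s in states}
        vecs = [visit(c) for c in children]
        return {
            s: sum(min(cost(s, t) + v[t] for t in states) for v in vecs)
            for s in states
        }
    return visit(root)

# Hypothetical toy tree: root -> (n1, z), n1 -> (x, y).
tree = {"root": ["n1", "z"], "n1": ["x", "y"]}
leaf_label = {"x": ("A", "east"), "y": ("A", "north"), "z": ("G", "north")}

bases = "AG"  # reduced alphabet to keep the joint state space small
regions = ["east", "west", "north"]
states = list(product(bases, regions))

# Assumed region-to-region costs and criterion weights.
REGION_COST = {
    "east": {"east": 0, "west": 1, "north": 3},
    "west": {"east": 1, "west": 0, "north": 1},
    "north": {"east": 3, "west": 1, "north": 0},
}
W_SEQ, W_GEO = 1.0, 2.0

def joint_cost(parent, child):
    """Weighted cost of an edge between two joint (base, region) states."""
    (a, r), (b, q) = parent, child
    return W_SEQ * (a != b) + W_GEO * REGION_COST[r][q]

root_vec = joint_sankoff(tree, "root", leaf_label, states, joint_cost)
best_state = min(root_vec, key=root_vec.get)
print(best_state, root_vec[best_state])
```

In this sketch the edge cost is a plain weighted sum, so the minimum equals what separate per-attribute runs would give. The joint table costs more (the state space is the product of the attribute state spaces), but a cost function that genuinely couples the attributes could be passed as `cost` without changing the recursion.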

matsengrp / historydag

Sankoff with multiple criteria #55