Closed hyanwong closed 6 months ago
Thinking about this more, I'm coming down on the side of sorting the .iedge_map_sorted_by_parent
array by parent_chromosome then max(parent_left, parent_right), with ties broken by child id.
Now sorting by max(left, right) when looking at parents
At the moment we claim that iedges are sorted by their child node time and ID, then child_left coordinate. In fact, since edges for a given child can't overlap, this is identical to sorting by child_right.
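A quick way to check the claim about non-overlapping edges is a small sketch like the one below. The intervals are made-up values for a single hypothetical child, not data from the library.

```python
# Hypothetical (child_left, child_right) intervals for ONE child node.
# They do not overlap, as required for edges sharing a child.
intervals = [(50, 80), (0, 20), (20, 50), (90, 100)]

by_left = sorted(intervals, key=lambda iv: iv[0])
by_right = sorted(intervals, key=lambda iv: iv[1])
same_order = by_left == by_right  # left-sort and right-sort agree

# Counter-example: once intervals may overlap (as they can when grouping
# by parent rather than by child), the two orderings can differ.
overlapping = [(0, 100), (10, 20)]
overlapping_same = (
    sorted(overlapping, key=lambda iv: iv[0])
    == sorted(overlapping, key=lambda iv: iv[1])
)
```

The second case is why the choice of left versus right coordinate only starts to matter for the parent-side ordering.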
However, if we also want to be able to iterate over iedges with the same parent ID, it does make a difference whether the ordering uses the left or right coordinate. The advantage of sorting by the right coordinate is that we can easily find the maximum genome length (which would be the right coordinate of the last edge for that chromosome).
It might be that, because we are focusing on edges with the same parent ID, we would want to create an index sorted by parent_right (or parent_left), but this is potentially trickier because of inversions? I guess it would be possible to sort by max(parent_left, parent_right).
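As a rough sketch of what that ordering could look like, here is a plain-Python version. The `IEdge` tuple and `parent_sort_key` function are hypothetical names for illustration only, not the library's API. It also assumes that an inverted edge is stored with parent_left > parent_right, and that the parent ID is the leading sort key.

```python
from typing import NamedTuple


class IEdge(NamedTuple):
    # Hypothetical stand-in for one row of the iedge table.
    child: int
    parent: int
    parent_chromosome: int
    parent_left: int
    parent_right: int


def parent_sort_key(e: IEdge):
    # Parent ID first (assumed), then chromosome, then the larger of the
    # two parent coordinates so inverted edges sort sensibly, then child
    # ID to break ties.
    return (
        e.parent,
        e.parent_chromosome,
        max(e.parent_left, e.parent_right),
        e.child,
    )


edges = [
    IEdge(child=3, parent=0, parent_chromosome=0, parent_left=0, parent_right=50),
    # Assumed encoding of an inversion: left > right.
    IEdge(child=4, parent=0, parent_chromosome=0, parent_left=100, parent_right=50),
    IEdge(child=2, parent=0, parent_chromosome=0, parent_left=20, parent_right=50),
    IEdge(child=5, parent=0, parent_chromosome=1, parent_left=0, parent_right=30),
]

ordered = sorted(edges, key=parent_sort_key)

# With this ordering, the last edge for a given parent and chromosome
# carries the largest coordinate used on that chromosome.
chrom0 = [e for e in ordered if e.parent == 0 and e.parent_chromosome == 0]
max_extent = max(chrom0[-1].parent_left, chrom0[-1].parent_right)
```

Using `max(parent_left, parent_right)` as the key means the inverted edge (child 4) still sorts last on chromosome 0, so the "last edge gives the maximum extent" shortcut holds even with inversions.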