cucapra / pollen

generating hardware accelerators for pangenomic graph queries

Fix inject #84

Closed anshumanmohan closed 1 year ago

anshumanmohan commented 1 year ago

I have identified and fixed the issue with inject!

We already know that injecting one subpath triggers (up to) two tiny chop-like operations. These tiny chops happen on (up to) two segments traversed by the original path. After performing these chops, we can:

But what if we need to tiny-chop a segment that was itself being traversed in the reverse orientation? Given the graph

S   1   ATG
S   2   CCCC
P   x   1-,2+   *

and the BED file

x   0   1   y

the correct output is

S       1       AT  // ?!
S       2       G   // ?!
S       3       CCCC
P       x       2-,1-,3+        *   // changed in place
P       y       2-      *       // ?!

Segment 1 needed to be chopped, but the point at which it was chopped is perhaps surprising, and so is the single step on path y. The way that path x has been fixed up is unsurprising once we accept the chop point of segment 1.
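As a sanity check on the output above, here is a minimal Python sketch that spells out path x before and after the inject and confirms it is unchanged. It assumes the usual GFA convention that a `-` step spells the reverse complement of its segment; the helpers `revcomp` and `spell` are hypothetical names for illustration, not part of pollen.

```python
# Hypothetical helpers for illustration only; not pollen's API.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def revcomp(seq):
    """Reverse-complement a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def spell(segments, steps):
    """Spell the sequence of a path given as a GFA step list, e.g. "1-,2+"."""
    out = []
    for step in steps.split(","):
        name, orient = step[:-1], step[-1]
        seq = segments[name]
        # Assumption: a '-' step reads the segment as its reverse complement.
        out.append(seq if orient == "+" else revcomp(seq))
    return "".join(out)

# The graph before and after the inject, copied from the example above.
before = {"1": "ATG", "2": "CCCC"}
after = {"1": "AT", "2": "G", "3": "CCCC"}

x_before = spell(before, "1-,2+")
x_after = spell(after, "2-,1-,3+")
y = spell(after, "2-")

print(x_before, x_after, y)
```

Under that convention, x spells the same sequence in both graphs, and y spells exactly the slice of x from index 0 to index 1, which is what the BED entry asked for.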

The explanation is this. Path x originally traversed segment 1 in the reverse orientation. So, when the BED file requested a new path y that tracked x from index 0 to index 1, path y wanted the character G (reading segment 1 backwards) and not the character A (reading segment 1 forwards).
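The orientation-dependent chop point can be sketched as follows. This is an illustration under stated assumptions rather than the actual fix: `chop_point` and `chop_step` are hypothetical names, and a step is modeled simply as a sequence plus an orientation flag.

```python
def chop_point(seg_len, path_offset, is_forward):
    """Translate an offset measured along the path into an offset measured
    along the segment's forward sequence."""
    # A reverse traversal enters the segment at its end, so a path offset of k
    # lands k characters back from the end of the forward sequence.
    return path_offset if is_forward else seg_len - path_offset

def chop_step(seq, path_offset, is_forward):
    """Chop one segment at a path offset.

    Returns the two pieces as (sequence, is_forward) pairs, listed in the
    order in which the path traverses them.
    """
    cut = chop_point(len(seq), path_offset, is_forward)
    left, right = seq[:cut], seq[cut:]
    if is_forward:
        return [(left, True), (right, True)]
    # Reverse traversal visits the right-hand piece first.
    return [(right, False), (left, False)]

# Segment 1 from the example: ATG traversed in reverse, chopped at path offset 1.
pieces = chop_step("ATG", 1, is_forward=False)
print(pieces)  # [('G', False), ('AT', False)]
```

The first piece, G in reverse orientation, is the one that path y picks up, matching the `2-` step in the output above. For a forward traversal the same call would split ATG into A and TG instead.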

This problem was only cropping up in the larger graphs because those happen to have more reverse-oriented segment traversals. The issue has nothing to do with flipped paths, and nothing to do with graph size as such.