vgteam / vg

tools for working with genome variation graphs

Contiguous subpaths cause error #4254

Closed glennhickey closed 2 months ago

glennhickey commented 3 months ago

Here's what looks like a normal graph to me (subranges are BED-like, so 0-based with an open end):

H   VN:Z:1.1
S   1   A
S   2   A
S   3   A
P   s#0#c[0-2]  1+,2+   *
P   s#0#c[2-3]  3+  *
L   1   +   2   +   0M
L   2   +   3   +   0M
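For readers who want to check the coordinates, here is a minimal Python sketch (not part of vg) that confirms the two subpaths above are consistent: each `[start-end]` span should equal the total length of the segments the path visits, and sorted subpaths on the same base name should not overlap. The helper name `find_subpath_problems`, the `SUBRANGE` regex, and the whitespace-based field splitting are all assumptions made for this illustration; real GFA is tab-separated.

```python
import re
from collections import defaultdict

# Assumed naming scheme for this sketch: <base>[<start>-<end>], 0-based, open end.
SUBRANGE = re.compile(r"^(?P<base>.+)\[(?P<start>\d+)-(?P<end>\d+)\]$")

EXAMPLE_GFA = """\
H VN:Z:1.1
S 1 A
S 2 A
S 3 A
P s#0#c[0-2] 1+,2+ *
P s#0#c[2-3] 3+ *
L 1 + 2 + 0M
L 2 + 3 + 0M
"""


def find_subpath_problems(gfa_text):
    """Return human-readable problems with P-line subranges (hypothetical helper)."""
    seg_len = {}                  # segment name -> sequence length
    subpaths = defaultdict(list)  # base path name -> [(start, end, segment names)]
    for line in gfa_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "S":
            seg_len[fields[1]] = len(fields[2])
        elif fields[0] == "P":
            match = SUBRANGE.match(fields[1])
            if match:
                # Drop the trailing orientation character (+ or -) from each step.
                steps = [step[:-1] for step in fields[2].split(",")]
                subpaths[match["base"]].append(
                    (int(match["start"]), int(match["end"]), steps)
                )

    problems = []
    for base, ranges in subpaths.items():
        ranges.sort(key=lambda r: (r[0], r[1]))
        prev_end = None
        for start, end, steps in ranges:
            covered = sum(seg_len[name] for name in steps)
            if end - start != covered:
                problems.append(
                    f"{base}[{start}-{end}] spans {end - start} bp "
                    f"but its steps cover {covered} bp"
                )
            if prev_end is not None and start < prev_end:
                problems.append(
                    f"{base}[{start}-{end}] overlaps the subpath ending at {prev_end}"
                )
            prev_end = end
    return problems


print(find_subpath_problems(EXAMPLE_GFA))
```

On the graph above, the check finds nothing wrong: `s#0#c[0-2]` covers segments 1 and 2 (2 bp) and `s#0#c[2-3]` covers segment 3 (1 bp), so the second subpath starts exactly where the first ends.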

But vg will refuse to parse it (or its W-line equivalent):

[gfa] error: cannot write multiple phase blocks on a sample, haplotyope, and contig in GFA format when paths already have subranges. Fix path s#0#c#0[2-3]

Maybe this is a result of this recent change, @adamnovak? https://github.com/vgteam/vg/commit/726ab66b070082cfee032b3ea8b1a7938990d223

adamnovak commented 3 months ago

OK, I think I have a fix for this in #4255. We were treating that path as having a phase block #0 on the end because it's not a reference sample. We then assumed we had to make fake coordinates for all the phase blocks, while also assuming we could use the provided subranges because it's phase block 0.

With the PR, we shouldn't make fake coordinates unless we see a path that doesn't already have a subrange.
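As a rough illustration of that rule (this is a Python sketch, not the code in the PR, and not vg's actual data structures), the decision could be expressed like this. `PathInfo` and `needs_fake_coordinates` are hypothetical names, and the sketch assumes all paths passed in share one sample, haplotype, and contig.

```python
from typing import List, NamedTuple, Optional, Tuple


class PathInfo(NamedTuple):
    """Hypothetical stand-in for the metadata attached to one path."""
    name: str
    phase_block: int
    subrange: Optional[Tuple[int, int]]  # (start, end), or None if absent


def needs_fake_coordinates(paths: List[PathInfo]) -> bool:
    """Assumed rule: only invent coordinates if some path lacks a subrange."""
    return any(path.subrange is None for path in paths)


# The two paths from the issue: both are phase block 0 and both carry subranges.
issue_paths = [
    PathInfo("s#0#c[0-2]", phase_block=0, subrange=(0, 2)),
    PathInfo("s#0#c[2-3]", phase_block=0, subrange=(2, 3)),
]
print(needs_fake_coordinates(issue_paths))
```

Under this assumed rule, the paths from the issue keep the subranges they already have, and fake coordinates would only come into play once a path without a subrange shows up.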