Open shilpagarg opened 7 years ago
Snarls have an equivalent representation with both nodes reversed and the start swapped with the end. It's probably not a bug, but you might want to make a visualization of the graph to be sure.
I would expect everything to be in the forward direction because I constructed the vg graph from a VCF, which is left to right.
Attached is example 1 picture. ex85.pdf
In case you are interested in more, you can just do vg find -n
As far as I can tell, this is a case of the representational equivalence I referred to, not a bug. I'll be more specific. Both of these are equivalent Snarls:
{"type": 1, "start": {"node_id": 85, "backward": true}, "end": {"node_id": 88, "backward": true}}
{"type": 1, "start": {"node_id": 88}, "end": {"node_id": 85}}
The invariant is that the "start" points into the Snarl and the "end" points out of the Snarl. The strandedness of the Snarl is arbitrary.
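To make the equivalence concrete, here is a minimal Python sketch that converts one representation into the other. It works on plain dicts shaped like the JSON above; `flip_visit` and `flip_snarl` are hypothetical helper names, not part of vg. It assumes that an omitted "backward" field means false, as in the second example.

```python
def flip_visit(visit):
    """Return the same node on the opposite strand.

    Assumption: a missing "backward" key means forward, so the key is
    only emitted when the flipped visit is backward.
    """
    flipped = {"node_id": visit["node_id"]}
    if not visit.get("backward", False):
        flipped["backward"] = True
    return flipped


def flip_snarl(snarl):
    """Return the equivalent snarl read from the other side.

    The old end, reversed, becomes the new start (it now points in);
    the old start, reversed, becomes the new end (it now points out).
    """
    flipped = dict(snarl)
    flipped["start"] = flip_visit(snarl["end"])
    flipped["end"] = flip_visit(snarl["start"])
    return flipped


backward_form = {"type": 1,
                 "start": {"node_id": 85, "backward": True},
                 "end": {"node_id": 88, "backward": True}}
forward_form = {"type": 1,
                "start": {"node_id": 88},
                "end": {"node_id": 85}}

print(flip_snarl(backward_form) == forward_form)
```

Flipping twice gives back the original dict, which is one way to see that neither form is more "correct" than the other.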
Most probably, we need the right orientations in the assembly graph, instead of arbitrary ones. For now, I can handle it for vg graphs constructed from a freebayes VCF because I know they are left to right. But we need to get the orientations right for assembly graphs. Do you agree?
You can just reverse the snarl - create a new snarl, set its end to the original's start and its start to the original's end, then add the contents to the new snarl in reverse order and set is_reverse to false. These two snarls are considered identical: direction in/out isn't a defining characteristic of a snarl. You can loop over all snarls in the graph to set them in the forward direction if you'd like. Snarls_main does this when outputting them in "sorted" order.
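As a rough illustration of looping over all snarls, the sketch below normalizes a list of snarl records given as JSON lines. It is not what Snarls_main does internally; the rule used here (flip a snarl only when both its start and end are backward) and the names `flip` and `normalize` are assumptions made for the example. Snarls with mixed orientations are left untouched, since flipping them would not make both ends forward.

```python
import json


def flip(snarl):
    """Swap start and end and reverse both, giving the equivalent snarl."""
    def rev(visit):
        out = {"node_id": visit["node_id"]}
        if not visit.get("backward", False):
            out["backward"] = True  # omitted key is treated as forward
        return out
    return {**snarl, "start": rev(snarl["end"]), "end": rev(snarl["start"])}


def normalize(snarls):
    """Flip snarls whose start and end are both backward (assumed rule)."""
    result = []
    for s in snarls:
        both_backward = (s["start"].get("backward", False)
                         and s["end"].get("backward", False))
        result.append(flip(s) if both_backward else s)
    return result


# Example input: one snarl in backward form, one already forward.
lines = [
    '{"type": 1, "start": {"node_id": 85, "backward": true}, '
    '"end": {"node_id": 88, "backward": true}}',
    '{"type": 1, "start": {"node_id": 90}, "end": {"node_id": 93}}',
]
forward = normalize([json.loads(line) for line in lines])
for snarl in forward:
    print(json.dumps(snarl))
```

This only rewrites the snarl boundaries. Any traversals stored with a snarl would need to be reversed in the same way to stay consistent with the new start and end.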
The orientations will matter for paths / SnarlTraversals when calling variants (as that will be coming from your reads, and they'd represent different things in forward/reverse).
I constructed a vg graph using a freebayes VCF, so why do we detect snarls with node_id in the backward direction? I would expect everything to be forward, is it not?
Is it some sort of bug? Or am I missing some logical details?
I also looked at the node sequence 21317
It is present in the reference genome, so why is it backward? Moreover, why are biallelic SnarlTraversals backward?
Here is the graph and its corresponding snarls: https://transfer.sh/13Kcwr/yeast.illumina.SK1_Y12.covall.chrI.freebayes.X.vg https://transfer.sh/OqVcB/yeast.illumina.SK1_Y12.covall.chrI.freebayes.X.xg https://transfer.sh/5k0nm/yeast.illumina.SK1_Y12.covall.chrI.freebayes.X.snarls