edawson closed this 8 years ago
We need to take the head and tail nodes to join from the command line. The problem is that we can't always do this automatically; we don't necessarily want to link all the head and tail nodes.
Another related problem is indicating that a specific alignment is circular. This would be a flag on the path.
On Wed, Mar 30, 2016 at 4:07 PM Eric T. Dawson notifications@github.com wrote:
I think it'd be nice to have a single command to circularize a graph by creating an edge between its head and tail nodes. Right now I do this manually, taking the graph through GFA, but it turns out I'm doing it more often than I initially expected.
I'll work on this tonight; it's probably all of five lines of code.
View it on GitHub: https://github.com/vgteam/vg/issues/281
So I have a thought or two on this. As @ekg handily pointed out, we may have some paths in a graph which are circular and some that are not. One example I can imagine of this is if we place an HPV decoy in a human graph. It is therefore important to circularize only specific paths.
I could solve this by taking in a file that looks like this:
ChromA circular
ChromB circular
ChromC linear
but it's a bit ugly and I want some opinions on it. In the meantime, I'll try generating a circular pan-HPV graph using MSGA to see if it's feasible/useful.
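For what it's worth, reading a file like the one above would be cheap. Here is a minimal sketch, assuming one whitespace-separated `name topology` pair per line. `parse_path_topologies` is a hypothetical helper invented for illustration, not an existing vg function, and the error handling is just one possible choice.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of vg): reads lines of the form
// "<path_name> <circular|linear>" and returns path name -> is-circular.
std::map<std::string, bool> parse_path_topologies(std::istream& in) {
    std::map<std::string, bool> is_circular;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name, topology;
        if (!(fields >> name)) {
            continue;  // skip blank lines
        }
        if (!(fields >> topology)) {
            throw std::runtime_error("missing topology for path " + name);
        }
        if (topology == "circular") {
            is_circular[name] = true;
        } else if (topology == "linear") {
            is_circular[name] = false;
        } else {
            throw std::runtime_error("unknown topology '" + topology +
                                     "' for path " + name);
        }
    }
    return is_circular;
}
```

Paths missing from the file could then be treated as linear by default, which would let the file list only the circular ones.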
Maybe it would be simpler to have a vg mod command that takes a list of paths in a file and ensures that they are circular in the graph. So this would mean setting a flag in the path object to say if it is circular or not and also adding some edges to allow the circular path to be fully embedded in the graph.
For msga we could take an argument which lists the circular paths by name. This would be ideal, as it makes the assembly circular-genome aware. Otherwise we will get divergent alignments at the starts and ends of the sequences.
So I guess vg::VG should have a function which circularizes an embedded path and ensures that we can traverse all parts of it in the graph by linking the head and tail positions with an edge if one does not already exist.
And also vg.proto should be patched to have an is_circular boolean flag on the Path object.
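To make the idea concrete, here is a rough sketch of the circularize step on toy types. `ToyPath`, `ToyGraph`, and `circularize` are stand-ins invented for illustration and are not vg's actual classes or API. The sketch also ignores node orientation, which a real implementation would have to respect when creating the edge.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are not vg's real types.
struct ToyPath {
    std::string name;
    std::vector<int64_t> node_ids;  // nodes visited by the path, in order
    bool is_circular = false;       // the flag proposed for the Path object
};

struct ToyGraph {
    std::set<std::pair<int64_t, int64_t>> edges;  // directed (from, to) pairs
};

// Marks the path as circular and makes sure an edge runs from its last
// node back to its first. Returns true only if a new edge was added.
bool circularize(ToyGraph& graph, ToyPath& path) {
    if (path.node_ids.empty()) {
        return false;  // nothing to join
    }
    path.is_circular = true;
    const int64_t head = path.node_ids.front();
    const int64_t tail = path.node_ids.back();
    // std::set::insert reports whether the element was newly inserted,
    // so an existing tail -> head edge is left untouched.
    return graph.edges.insert(std::make_pair(tail, head)).second;
}
```

Calling it twice on the same path is harmless: the second call finds the edge already present and adds nothing.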
I've got some more extensions to this.
I'm on it!
Next extension: MSGA should handle circular paths.