vgteam / vg

tools for working with genome variation graphs

Viewing alignments without JSON output #353

Open hgibling opened 8 years ago

hgibling commented 8 years ago

Hello,

There seems to be a step missing in the visualization wiki page, in the viewing alignments section (https://github.com/vgteam/vg/wiki/visualization#viewing-alignments). In the last line of code, cyclic/reverse_self.vg comes out of nowhere. I can view the alignments to the graph by running vg view -d t.vg -A t.gam | dot -Tsvg -o aln.svg, but the alignments are shown as JSON. Is there a way to get just the nucleotides, like in the example graph in that section?

ekg commented 8 years ago

We could add an option to output the actual sequences. This view ended up being used mostly for debugging. Rendering the JSON version of the mapping made that possible without completely destroying the graph sequence space visualization.

It would be cool if we could render each node and its alignments as a mutually gapped MSA, as in tview. This is on my mind because I've been working on hhga a lot this week. It would also be relevant for variant calling if the function were abstract enough.

What are you using the visualization for?

hgibling commented 8 years ago

Right now just for visualizing how everything works and explaining genome graphs to others, but it would be great if there was something for variant calling--that's the bulk of my project.

ekg commented 8 years ago

There is a pipeline in place but it was implemented quickly to scale the system. We are trying to improve it and it's turning out to be no easier than getting variant calling to work on a linear reference.

hgibling commented 8 years ago

Ok, hope to see something in the future :)

adamnovak commented 8 years ago

OK, the wiki previously had this:

vg construct -v tiny/tiny.vcf.gz -r tiny/tiny.fa >t.vg
vg index -x t.xg -g t.gcsa -k 11 t.vg
vg sim -l 20 -n 10 -e 0.05 -i 0.02 t.vg >t.reads
vg map -r t.reads -x t.xg -g t.gcsa -k 22 >t.gam
vg view -d cyclic/reverse_self.vg | dot -Tsvg -o aln.svg

Looks like we're drawing a completely different graph than the one we just built. I changed it to this:

vg construct -v tiny/tiny.vcf.gz -r tiny/tiny.fa >t.vg
vg index -x t.xg -g t.gcsa -k 11 t.vg
vg sim -l 20 -n 10 -e 0.05 -i 0.02 t.vg >t.reads
vg map -r t.reads -x t.xg -g t.gcsa -k 22 >t.gam
vg view -d t.vg -A t.gam | dot -Tsvg -o aln.svg

If you want a simpler view of the alignments, I think adding the -S option will simplify both the graph and the alignments.

This issue then boils down to a request for an option for vg view -d that draws just the sequence in the alignment nodes.
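Until such an option exists, one possible workaround is to read the sequences from the JSON form of the GAM instead of from the drawing. The sketch below is only an illustration: it assumes `vg view -a t.gam` prints one JSON alignment per line and that each record carries top-level `name` and `sequence` fields. The function name `read_sequences` is hypothetical and not part of vg.

```python
import json


def read_sequences(json_lines):
    """Yield (name, sequence) for each alignment record.

    Assumption: each non-empty line is one JSON alignment object,
    as `vg view -a t.gam` is expected to print.
    """
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        aln = json.loads(line)
        # Only the top-level "sequence" is the read itself; edits nested
        # under "path" may carry their own "sequence" fields, which are
        # deliberately ignored here.
        yield aln.get("name", ""), aln.get("sequence", "")
```

To use it on real data, pipe `vg view -a t.gam` into a small script that calls `read_sequences(sys.stdin)` and prints each pair. This gives the nucleotides as text only; it does not change what `vg view -d` draws.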

hgibling commented 8 years ago

> If you want a simpler view of the alignments, I think adding the -S option will simplify both the graph and the alignments.

That simplifies it too much for me (just node numbers) :) Thanks for updating the wiki

EmperorDali commented 4 years ago

Hi all - I wanted to check in on the status of the issue.

@samlhao and I are trying to replicate this example alignment visualization in the wiki:

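[image] https://user-images.githubusercontent.com/10357776/67317325-cc4be400-f4be-11e9-93af-4e8c5d127f74.png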

In particular, we're interested in seeing the aligned sequences overlaid on the sequences of the nodes.

However, using the commands in the wiki:

vg construct -v tiny.vcf.gz -r tiny.fa >t.vg
vg index -x t.xg -g t.gcsa -k 11 t.vg
vg sim -l 20 -n 10 -e 0.05 -i 0.02 -x t.xg t.vg >t.reads
vg map -T t.reads -x t.xg -g t.gcsa >t.gam
vg view -d t.vg -A t.gam | dot -Tsvg -o aln.svg

on v1.19.0-51-g6f0e168d7 "Tramutola" produces a visualization where the alignment nodes are labeled with JSON protobuf blocks, rather than the actual sequence as displayed in the Wiki:

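[image] https://user-images.githubusercontent.com/10357776/67317613-2b115d80-f4bf-11e9-820b-4cd9638bc9ee.png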

Examining the code in src/vg.cpp, it looks like the sequence for the alignment nodes is never generated. On line 5975, the protobuf of the alignment is just dumped to mapid as JSON and eventually written:

https://github.com/vgteam/vg/blob/a66dd273851efdff6240c5a12e878301f162f680/src/vg.cpp#L5973-L5980

Our question:

How was the visualization in the Wiki, the first image in this comment, generated? Is it possible to visualize an alignment where the sequences of the aligned nodes are visible, like in the Wiki?

ekg commented 4 years ago

The wiki image was produced by an older version of the code. It would be a small project to emit the alignment sequence in place of the JSON, if you're interested in doing it. It might even be possible to find out from the git log how it was done before.
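As a rough illustration of what extracting the alignment sequence could involve, the sketch below splits a read into the pieces aligned to each node, working on the JSON form of an alignment rather than inside vg. It is not how vg does or did it. It assumes the JSON mirrors the alignment schema (`path.mapping[].position.node_id` and `path.mapping[].edit[].to_length`), treats a missing `to_length` as 0, and ignores strand (`is_reverse`). The function name `per_node_segments` is hypothetical.

```python
def per_node_segments(alignment):
    """Split a read's sequence into the pieces aligned to each node.

    `alignment` is a dict parsed from one JSON alignment record.
    Returns a list of (node_id, read_segment) in path order.
    """
    seq = alignment.get("sequence", "")
    segments = []
    offset = 0  # how many read bases earlier mappings have consumed
    for mapping in alignment.get("path", {}).get("mapping", []):
        # Read bases consumed by this mapping: the sum of its edits'
        # to_length values (deletions contribute 0).
        used = sum(int(e.get("to_length", 0)) for e in mapping.get("edit", []))
        # node_id may be serialized as a string or a number; normalize it.
        node_id = str(mapping.get("position", {}).get("node_id", "?"))
        segments.append((node_id, seq[offset:offset + used]))
        offset += used
    return segments
```

Each returned segment could then be used as the label of the corresponding alignment node in place of the JSON block, though wiring that into the DOT output would still have to be done in vg itself.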
