jasperlinthorst / reveal

Graph based multi genome aligner
MIT License

How can i output vcf format? #1

Closed by sailgu 8 years ago

sailgu commented 9 years ago

Hi, when I run the test I only get a GFA file. Although I have read Heng Li's introduction to the GFA format, I still don't understand the format well. Is there a way to get VCF output from reveal?

jasperlinthorst commented 9 years ago

Hi juyouhui, I've been working on a new version of reveal which is capable of generating multi-alignments of small genomes in a non-progressive manner. Unfortunately, the vcf output is missing in this version. You can get the old version by checking out commit fcdba9d3631e0404ce1a95feb063cf6b9a484ca8. I should have put a tag, sorry about that.

So, in order to get vcf output, do the following:

git clone https://github.com/jasperlinthorst/reveal
cd reveal
git checkout fcdba9d3631e0404ce1a95feb063cf6b9a484ca8
python setup.py install

Then, create the alignment graph:

reveal 1a.fa 1b.fa --> should produce 1a_1b.gfa

Then, call variants in the alignment graph:

reveal 1a_1b.gfa --> should produce 1a_1b.vcf.gz
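To take a quick look at the resulting variants without extra tooling, something like the following minimal Python sketch may help. It assumes the output is a standard tab-separated VCF, possibly gzip- or bgzip-compressed; the function name `read_vcf_records` is hypothetical and not part of reveal.

```python
import gzip


def read_vcf_records(path):
    """Yield (chrom, pos, ref, alts) for every data line of a VCF file.

    Hypothetical helper, not part of reveal. Assumes the standard VCF
    column order: CHROM, POS, ID, REF, ALT, ...
    """
    # bgzip output is valid (multi-member) gzip, so gzip.open can read it.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip meta-information ("##"), the header ("#CHROM") and blanks.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            chrom, pos, _id, ref, alt = fields[:5]
            # Multiple alternative alleles are comma-separated in ALT.
            yield chrom, int(pos), ref, alt.split(",")
```

For example, `for record in read_vcf_records("1a_1b.vcf.gz"): print(record)` would print one tuple per variant, assuming the file name from the steps above.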

On the other hand, I don't know how long or variable the genomes/sequences you're working with are, but you might want to use the latest version of reveal (which I would recommend) to produce GML output. This can be done by supplying the --gml flag when aligning. For example:

reveal 1a.fa 1b.fa 1c.fa --gml

This will produce an alignment graph of the three sequences in GML format, which can be imported into graph visualisation software like Gephi and Cytoscape. This might help in understanding the produced graphs a bit better, as well as the effects of the different parameters on the resulting alignment.
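For readers who find the GFA text itself hard to follow, a small script can also make the structure more concrete. The sketch below is an illustration under assumptions: it relies only on the tab-separated `S` (segment: name, sequence) and `L` (link: from, orientation, to, orientation) records of GFA 1, and it has not been checked against the exact GFA dialect that a given reveal version writes. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_gfa(lines):
    """Collect segments and links from an iterable of GFA lines.

    Hypothetical helper. Only 'S' and 'L' records are interpreted;
    every other record type is merely counted.
    """
    segments = {}      # segment name -> sequence
    links = []         # (from_name, from_orient, to_name, to_orient)
    other = Counter()  # record type -> number of lines not interpreted
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        kind = fields[0]
        if not kind:
            continue  # blank line
        if kind == "S" and len(fields) >= 3:
            segments[fields[1]] = fields[2]
        elif kind == "L" and len(fields) >= 5:
            links.append((fields[1], fields[2], fields[3], fields[4]))
        else:
            other[kind] += 1
    return segments, links, other


def branching_segments(links):
    """Return names of segments with more than one outgoing link.

    In an alignment graph these are places where the aligned
    sequences diverge, i.e. candidate starts of variant 'bubbles'.
    """
    successors = defaultdict(set)
    for from_name, _from_orient, to_name, _to_orient in links:
        successors[from_name].add(to_name)
    return sorted(name for name, nxt in successors.items() if len(nxt) > 1)
```

With a file on disk, `summarize_gfa(open("1a_1b.gfa"))` returns the segments and links, and `branching_segments(links)` lists the segments after which the genomes differ.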

Let me know if you run into any problems,
Jasper

sailgu commented 9 years ago

Thanks for your detailed reply. I tried the old version and it worked.

jasperlinthorst commented 4 years ago

reveal variants 1a_1b.gfa --vcf