vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

Can vg construct a trans-chromosomal graph ? #4237

Open dudududu12138 opened 3 months ago

dudududu12138 commented 3 months ago

Hi. If I have a series of interchromosomal translocation events stored in a .vcf file, can I use vg to construct a graph that represents this kind of variant? If vg cannot do this, can other tools such as PGGB or Minigraph-Cactus? Thank you!

jeizenga commented 3 months ago

Sorry, vg construct does not currently support translocation variants, and I'm not aware of another tool that does. Both PGGB and Minigraph-Cactus construct graphs from assembled genome sequences, not VCFs.

adamnovak commented 3 months ago

It would be great if we could do this, but it causes a lot of trouble for the parallel-by-contig, streaming architecture we use in the constructor. So I think this might be best implemented as a new algorithm for turning those events into graphs.

VCF 4.4 is now supposed to have much better support for breakend alleles, and it allows actually phasing them with the PSL tag. Either vg could learn to interpret them, or someone (@dudududu12138?) could write a converter from VCF 4.4 breakends and PSL to GFA.
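As a rough starting point for such a converter, here is a minimal Python sketch of the first step: parsing the four breakend ALT forms described in the VCF specification (`t[p[`, `t]p]`, `]p]t`, `[p[t`). The names `Breakend` and `parse_bnd_alt` are made up for illustration and are not part of vg. The sketch does not read PSL phasing, split segments at breakpoints, or emit GFA.

```python
import re
from dataclasses import dataclass

# One bracketed mate locus plus local bases on exactly one side,
# e.g. "G]17:198982]" or "[17:198983[A".
_BND = re.compile(
    r"^(?P<lead>[ACGTNacgtn.]*)(?P<b1>[\[\]])"
    r"(?P<contig>.+):(?P<pos>\d+)"
    r"(?P<b2>[\[\]])(?P<trail>[ACGTNacgtn.]*)$"
)


@dataclass(frozen=True)
class Breakend:
    mate_contig: str
    mate_pos: int             # 1-based, as written in the VCF
    joined_after: bool        # True: mate piece follows the local bases
    mate_extends_right: bool  # True for "[p[", False for "]p]"

    @property
    def mate_reversed(self) -> bool:
        # "t]p]" and "[p[t" join a reverse-complemented mate piece.
        return self.joined_after != self.mate_extends_right


def parse_bnd_alt(alt: str) -> Breakend:
    m = _BND.match(alt)
    if m is None or m["b1"] != m["b2"]:
        raise ValueError(f"not a breakend ALT: {alt!r}")
    lead, trail = m["lead"], m["trail"]
    if bool(lead) == bool(trail):
        # Local bases must appear on exactly one side of the brackets.
        raise ValueError(f"ambiguous breakend ALT: {alt!r}")
    return Breakend(
        mate_contig=m["contig"],
        mate_pos=int(m["pos"]),
        joined_after=bool(lead),
        mate_extends_right=(m["b1"] == "["),
    )
```

In a GFA output, `mate_reversed` would presumably decide whether the link enters the mate segment in `-` orientation, once the reference segments have been cut at each breakpoint.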

dudududu12138 commented 3 months ago

Thanks for your reply. If I want to represent fusion genes with a graph model, is it reasonable to use vg rna and manually modify the GFF file? Most fusion genes involve genes on different chromosomes, so I am not sure whether this is workable.

dudududu12138 commented 3 months ago

By the way, can I construct just a spliced reference graph? There would be no variants in this graph, just spliced nodes. I would provide only the reference (.fa) and the annotation file (.gff). Can vg rna do this? Or can vg construct build a flat graph from just the reference genome, with no variants? I am using the latest vg, and it seems that a VCF file is required.

jeizenga commented 3 months ago

I think it would be difficult to use vg rna for fusion genes as it's currently implemented. Apart from building an initial graph with fusion mutations, vg rna assumes that a GTF record is expressed against a single contig. You would have to do something like adding a new path representing the fusion and re-expressing the transcripts against the fused path.
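To make the "re-expressing" step concrete, here is a small Python sketch of the coordinate arithmetic. It is only an illustration under strong assumptions: the fused path is the 5' partner contig up to and including a breakpoint, followed by the 3' partner contig from its breakpoint onward; both pieces are on the forward strand; the two partners are different contigs; and coordinates are 1-based and inclusive as in GTF. The function name `lift_to_fusion` and its parameters are hypothetical, not anything vg provides.

```python
from typing import Optional, Tuple


def lift_to_fusion(
    contig: str,
    start: int,
    end: int,
    *,
    a_contig: str,
    a_break: int,  # last base of the 5' partner kept in the fused path
    b_contig: str,
    b_break: int,  # first base of the 3' partner kept in the fused path
) -> Optional[Tuple[int, int]]:
    """Map a 1-based inclusive interval onto the fused path.

    Returns None when the interval is not fully contained in the
    retained part of either partner (this sketch does not clip exons).
    """
    if contig == a_contig and end <= a_break:
        # The 5' partner keeps its original coordinates.
        return (start, end)
    if contig == b_contig and start >= b_break:
        # Base b_break of the 3' partner lands at position a_break + 1.
        offset = a_break - b_break + 1
        return (start + offset, end + offset)
    return None
```

Each exon of the fusion transcript would be lifted this way and written out as a GTF record against the fused path name. Reverse-strand partners would also need the interval flipped, which is not handled here.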

Regarding the spliced reference graph: that is currently possible. With vg rna, you can add splice junctions to a graph that contains no variants.
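A sketch of how that two-step workflow might be scripted from Python follows. It assumes that `vg construct -r` builds a flat graph when no VCF is given and that `vg rna -n` takes the annotation file; please confirm both against `vg construct --help` and `vg rna --help` for your version. The file names and the helper names are placeholders.

```python
import subprocess
from typing import List


def spliced_reference_commands(
    fasta: str, annotation: str, flat_graph: str
) -> List[List[str]]:
    """Build the two vg invocations; each writes its graph to stdout."""
    # Assumption: -r alone (no -v VCF) yields a variant-free graph.
    construct = ["vg", "construct", "-r", fasta]
    # Assumption: -n names the transcript annotation file.
    rna = ["vg", "rna", "-n", annotation, flat_graph]
    return [construct, rna]


def build_spliced_reference(
    fasta: str, annotation: str, flat_graph: str, spliced_graph: str
) -> None:
    """Run both steps, redirecting each command's stdout to a file."""
    construct, rna = spliced_reference_commands(fasta, annotation, flat_graph)
    with open(flat_graph, "wb") as out:
        subprocess.run(construct, stdout=out, check=True)
    with open(spliced_graph, "wb") as out:
        subprocess.run(rna, stdout=out, check=True)
```

For example, `build_spliced_reference("ref.fa", "genes.gtf", "flat.vg", "spliced.vg")` would write the flat graph first and then the spliced one, provided `vg` is on the PATH.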