Open adamnovak opened 4 years ago
Yes, alt paths are needed to genotype VCFs with vg call. For any graph derived from VCF, it's a nice feature to be able to express genotypes exactly in terms of the input VCF(s).
That said, I'm not sure how important it is to @jmonlong right now. Without alt paths, you can get more-or-less equivalent genotypes; they'll just be shifted around, merged and whatnot when compared to the VCFs used to create the graph.
Hi there!
I would like to know if there is a way to incorporate VCF alternate paths into an existing pangenome GFA, VG or XG file. Given that vg add doesn't support this option, I was thinking of:
- vg construct -A with FASTA and VCF files
- vg combine
Is there a more efficient way to do this? Also, is there a reason why vg add is now deprecated?
Thanks for your time!
Best regards, Nuno
@nuno-agostinho I don't think that will work; vg combine just puts both graphs floating next to each other as if they were separate sets of chromosomes; it doesn't weld the graphs together along shared paths. And in fact it might fail if the graphs have paths with the same names in them.
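To make the name-collision concern concrete, here is a minimal Python sketch that lists path names shared by two graphs before attempting to combine them. It is not part of vg: it assumes both graphs have been exported as GFA 1.x text with paths stored on P lines, and the helper names `gfa_path_names` and `shared_path_names`, as well as the toy graphs, are hypothetical.

```python
def gfa_path_names(gfa_text):
    """Return the set of path names declared on P lines of GFA 1.x text."""
    names = set()
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # GFA 1.x path records look like: P <name> <segments> <overlaps>
        if len(fields) >= 2 and fields[0] == "P":
            names.add(fields[1])
    return names


def shared_path_names(gfa_a, gfa_b):
    """Return the path names that occur in both graphs, sorted."""
    return sorted(gfa_path_names(gfa_a) & gfa_path_names(gfa_b))


# Hypothetical toy graphs: both carry a path called "chr1".
pangenome = "H\tVN:Z:1.0\nS\t1\tACGT\nP\tchr1\t1+\t*\n"
vcf_graph = (
    "H\tVN:Z:1.0\nS\t1\tACGT\nS\t2\tG\n"
    "P\tchr1\t1+\t*\nP\tsample1\t1+,2+\t*\n"
)
print(shared_path_names(pangenome, vcf_graph))
```

A non-empty result only flags names that would clash; it says nothing about whether the two graphs could be welded together along those paths.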
@glennhickey might have something in the Cactus universe that could do the required welding operation. It's well within what the pinchesAndCacti library could be used to efficiently compute, but I don't know if any command-line tools exist to do it.
We marked vg add as deprecated not because we have a better way to do what it does, but because it doesn't do what it does very well and because we don't need to do it enough to justify filling in the gaps like the missing alt paths.
Hi @adamnovak, thanks for the heads up about vg combine.
I am looking into this to explore how to efficiently support sets of genetic variants from VCF files with pangenome graphs in Ensembl Variant Effect Predictor (VEP) and the Ensembl website.
I like how vg construct -A integrates variants from a VCF, including the alternative allele as alt paths. Something like the vg add command with support for adding alt paths for VCFs seemed to have potential for our use case, but I can continue looking for alternatives.
Thanks for your input!
@jmonlong says we use vg add for some structural variant graph stuff.
Looking at the code, vg add does not appear to create paths for the alleles of the variants it adds.
This means we can't find them for GBWT indexing, but I don't think we ever want to make GBWT indexes for variants we pull in with vg add anyway, since they aren't phased with variants from the original VCF. But it also means that genotyping modes that use the alt paths won't work.
@glennhickey we do use the alt paths for some kinds of genotyping, right? Is it worth adding in support for them in vg add?
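One rough way to check whether a graph carries alt paths at all is to scan a GFA export for path names with the alt-path prefix. The Python sketch below assumes that alt paths are written as P lines and that their names start with `_alt_`, which is my understanding of the vg construct naming convention and should be verified against your vg version. The function name `alt_path_names` and the toy graph are hypothetical.

```python
# Assumption: alt paths created by vg construct have names starting with "_alt_".
ALT_PREFIX = "_alt_"


def alt_path_names(gfa_text, prefix=ALT_PREFIX):
    """Return names of P-line paths in GFA 1.x text that start with prefix."""
    names = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # GFA 1.x path records look like: P <name> <segments> <overlaps>
        if len(fields) >= 2 and fields[0] == "P" and fields[1].startswith(prefix):
            names.append(fields[1])
    return names


# Hypothetical toy graph: one reference path plus two allele paths for one variant.
toy_gfa = (
    "H\tVN:Z:1.0\n"
    "S\t1\tAC\nS\t2\tG\nS\t3\tT\nS\t4\tCA\n"
    "P\tchr1\t1+,2+,4+\t*\n"
    "P\t_alt_abc123_0\t2+\t*\n"
    "P\t_alt_abc123_1\t3+\t*\n"
)
print(alt_path_names(toy_gfa))
```

An empty list for a graph that was extended with vg add would be consistent with the observation above that no allele paths are created for the added variants.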