Open shimiao123 opened 1 year ago
Yeah, some variants can move around during graph construction. Also, while calling, variants that are side-by-side may be called together, leading to sites with slightly different coordinates.
There is a way to genotype your input vcf with vg call
, but I'm not sure it works with autoindex (@jeizenga would be nice to add this recipe).
You'd need to do
vg construct -a -r ref.fasta -v test.vcf > test.vg
vg index test.vg -L -x test.xg
vg index test.vg -v test.vcf -G test.gbwt
vg gbwt -x test.xg test.gbwt -g test.gbz --gbz-format
vg index test.gbz -j test.dist
vg minimizer test.gbz -d test.dist -o test.min
After that you'd run giraffe
on the .gbz / .dist / .min
files, and then pack
and call
using the .xg
. When you run call
, you can then pass in test.vcf
with call -v
and it will give you genotypes exactly on the input vcf.
Thanks for your advise. Follow your advise, I ran by the command: step1:vg construct -a -t 60 -r ref.fasta -v test.vcf > test.vg step2:vg index test.vg -t 60 -L -x test.xg step3:vg index test.vg -t 60 -v test.vcf -G test.gbwt step 1 and 2 can normally run without any error,but step 3 with error like this: error: [HaplotypeIndexer::parse_vcf] variant file 'test.vcf' does not contain phasings.
My vcf file like this;
......
chr01 13093 . T TGGTGGCTACTCCAATGAAGATTTAATGATTATCATCATGTGAAGATACATCATTTTGACCATTGGATGATAGATTGTAGGGCTTGATTTGATTTGAAGTAAAAGTGTTACTTTTATTTAAAATGTTGCTAAATAAATAAGTTACACTTTTTGAACAAGGATCATCATGAGAAGATGTTTTTGCCATCTTCATGGGAGTAGTTGCCATAAAAAAT . . SVTYPE=PAV L542 L542,90632,90846,215 Please tell me what happends and what can I do for this?
Yeah, that pipeline requires the input variants be phased.
So,for my data,how can I get the same variant positions between my input test.vcf and output ICG10468.vcf? I have 5 samples for calling variants, I need know which are the common shared variants, how can I get the result?
hi vg team:
vg autoindex --workflow giraffe -r ref.fasta -v test.vcf -t 30 -p Ah -T ./tmp vg giraffe -t 60 -Z Ah.giraffe.gbz -m Ah.min -d Ah.dist -f test_R1.fq.gz -f test_R2.fq.gz -p > ICG10468.gam vg pack -t 60 -x Ah.giraffe.gbz -g ICG10468.gam -Q 5 -s 5 -o ICG10468.pack vg call Ah.giraffe.gbz -a -t 60 -k ICG10468.pack > ICG10468.vcf
I confused about the output vcf file. Some variant positions same between input test.vcf and output ICG10468.vcf, but some positions difference. Why? What happends? example: Some variant positions in input test.vcf file: chr01 2488160 ... chr01 2502400 ... chr01 2504303 ... Some variant positions in output ICG10468.vcf file: chr01 2488160 ... chr01 2502400 ... chr01 2504304 ... hope for your reply.
Bests