Closed YibinQiu closed 9 months ago
Can you please provide the output of vg paths -Mx out.gbz
? It looks like your graph has now reference paths, which is strange since it comes from vg construct
Can you please provide the output of
vg paths -Mx out.gbz
? It looks like your graph has now reference paths, which is strange since it comes fromvg construct
@glennhickey Yes, the path of the reference (like chr1, chr2, chr3, etc.) seems to have been lost, leaving only the path information of the individuals, which is very strange. The output looks like this (I also uploaded the file):
I also test vg paths -Mx out.vg
. The output looks like this:
Perhaps I should follow these steps to build gbz? https://github.com/vgteam/vg/wiki/VG-GBWT-Subcommand (I'm trying it, and the process is still ongoing):
Now, I use out.xg to vg call, When I ran a single instance, it took 5 hours, but when I submitted 50 tasks simultaneously on Slurm, it has been running for 14 hours and the tasks are still not finished.
vg call -t 24 ${sample}_out.xg -r ${sample}_out.snarls -k ${sample}.pack -s ${sample} -a > ${sample}.vcf
If possible, I would prefer to use out.gbz instead of out.xg, as it should save a lot of time, right?
I follow the steps to build gbz( https://github.com/vgteam/vg/wiki/VG-GBWT-Subcommand):
vg construct -f -a -S -r $ref -v convert_merge.vcf.gz -p > out.vg
vg gbwt -x out.vg -v convert_merge.vcf.gz -o out.gbwt -p --temp-dir ${LARGE_DISK_TMP}vg gbwt -x out.vg -E -o out.paths.gbwt -p --temp-dir ${LARGE_DISK_TMP}
vg gbwt -m out.gbwt out.paths.gbwt -o out.combined.gbwt -p --temp-dir ${LARGE_DISK_TMP}
vg gbwt -x out.vg out.combined.gbwt --gbz-format -g out.gbz -p --temp-dir ${LARGE_DISK_TMP}
and then try vg paths -Mx out.gbz
again, the head of output looks like:
but when I grep 'GENERIC', the output seems having reference paths:
@glennhickey I would like to ask for your advices. Is it correct to do so for the above results?
I run vg call -t 24 out.gbz -r ${sample}_out.snarls -k ${sample}.pack -s ${sample} -a -z > ${sample}.vcf
, and it worked fine (~30 mins for one sample).
But when I use this out.gbz to vg call -v :
vg call -t 24 out.gbz -r ${sample}_out.snarls -k ${sample}.pack -s ${sample} -v convert_merge.vcf.gz -f $ref > ${sample}.vcf`
It output the empty vcf, the log file displayed:
[VCFTraversalFinder] Warning: No alt path (prefix=_alt_659226977429afe25da2e2b94abbb1a7516628ab_) found in graph for variant. It will be ignored:
chr1 20973 515:INS:chr1_20974 TA TGCCCTGGGCCCCGGCTGGCCCGTGCCCTGGCCCTG 60 PASS SVTYPE=INS;SVMETHOD=SURVIVOR1.0.7;CHR2=chr1;END=20974;CIPOS=0,0;CIEND=20974,0;STRANDS=+-
[VCFTraversalFinder] Warning: No alt path (prefix=_alt_b0b0a41960b57218cbc66182eaeaad643ad8e6b9_) found in graph for variant. It will be ignored:
chr1 21049 704:INS:chr1_21050 CT CAAAGAATGTAGATTTTAACTCTTTAACTTCAGATTTCTAATTT 60 PASS SVTYPE=INS;SVMETHOD=SURVIVOR1.0.7;CHR2=chr1;END=21050;CIPOS=0,0;CIEND=21050,0;STRANDS=+-
....
....
....
I have looked at several similar questions posed by others (#3963 #3974 #3950) and saw your answer; it seems that 'vg call -v' can only be used with xg.
A few things. If you want to use -v
with vg call
, you need to follow the instructions here.
It looks like you are doing this for the index, but this part here is important
You can now use test.gbz, test.dist and test.min as input for vg giraffe and test.xg as input for vg call.
Ie, you need to run vg call
on the xg
and vcf
in order for it to work (even though you ran
vg giraffe
on the gbz
).
If you do not want to pass in your vcf
with -v
then you can don't need to bother with this, and can just run call
on the .gbz you used for giraffe
(that you made here). To speed that up you can try vg call -z
Thank you for the help with this
1. What were you trying to do? I wish to manually construct and index the graph using vg construct -a and perform SV genotyping of an individual with second-generation sequencing data using vg call -v. I followed the commands as per the official vg documentation (https://github.com/vgteam/vg/wiki/SV-Genotyping-and-variant-calling):
2. What did you want to happen? The SV information contained in the VCF file generated by the vg call step corresponds to the VCF input
3. What actually happened? When running vg call out.gbz, the following error is reported:
error [vg call]: No reference paths found
When I replaced ${sample}_out.gbz with ${sample}_out.xg, vg call was able to run normally, but it displayed:
Following the suggestion #3950 , I also used :
but the result was the same, showing:
error [vg call]: No reference paths found.
So, I'm not sure which step I did wrong?Additionally, I previously tried using vg autoindex to construct the graph, and performing vg call on out.gbz with 15X second-generation sequencing data only took about half an hour. Now, using vg construct -a and vg call out.xg, I find that the runtime of vg call is too long (~5 hours).
4. If you got a line like
Stack trace path: /somewhere/on/your/computer/stacktrace.txt
, please copy-paste the contents of that file here:5. What data and command can the vg dev team use to make the problem happen?
6. What does running
vg version
say?