vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

Problem of vg giraffe #4149

Open Lucio-Yang opened 11 months ago

Lucio-Yang commented 11 months ago

Hi, I want to align short reads to a graph using vg giraffe and get a BAM file, but I got the following error report. The VCF and graph (GFA file) were constructed using PGGB.

#############################################
Guessing that pangenome.xg is XG
MinimizerHeader: Expected v8 to 9, got v7
error[VPKG::load_one]: Correct input type not found in wheat-pangenome_chr14.dist while loading bdsg::SnarlDistanceIndex
#############################################

Command:

vg autoindex --workflow giraffe -r cs.PggbChrID.chr14.fasta -v combined.paf.417fcdf.42e55e5.smooth.final.cs.vcf -p pangenome_chr14

vg convert -t 60 -g combined.paf.417fcdf.42e55e5.smooth.final.gfa -v > pangenome_chr14.vg

vg index pangenome_chr14.vg -L -x pangenome_chr14.xg

vg index pangenome_chr14.vg -g pangenome_chr14.gcsa -k 16

vg snarls pangenome_chr14.giraffe.gbz > pangenome_chr14.snarls

vg giraffe -t 60 -Z pangenome_chr14.giraffe.gbz -m pangenome_chr14.min -d pangenome_chr14.dist -f T1_1.fastp.fq.gz -f T1_2.fastp.fq.gz -o BAM > T1.mapped.bam

Looking forward to your reply. Thanks!

Ahahaha3 commented 11 months ago

What does running vg version say?

Lucio-Yang commented 11 months ago

My vg version is v1.51.0

vg version v1.51.0 "Quellenhof"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by anovak@courtyard.gi.ucsc.edu

Ahahaha3 commented 11 months ago

Maybe you can try version 1.38.

jeizenga commented 11 months ago

You seem to have added some steps to the indexing pipeline that I can't see a purpose for. You don't need to create an XG, a snarls file, or a GCSA index to run vg giraffe. I also can't tell where the combined.paf.417fcdf.42e55e5.smooth.final.gfa file is coming from, seeing as you're using FASTA + VCF input for vg autoindex.

In any case, the error you're seeing makes it look like you have indexes from a mismatched VG version. Have you changed your VG version in between making the indexes and trying to use them? If so, you should be able to resolve this error by recreating the indexes with a consistent version.
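The advice above can be sketched as a two-step workflow that uses one vg binary for both indexing and mapping, with no XG, GCSA, or snarls step. This is a rough sketch, not a definitive recipe: the file names and flags are copied from the commands in the original post, while the `run` helper, the `DRY_RUN` variable, and the function names are hypothetical conveniences. Deleting the old index files first is an assumption about how to avoid reusing stale ones, not something vg is known to require.

```shell
#!/usr/bin/env bash
# Sketch only: file names come from the original post; helper names are hypothetical.

VG="${VG:-vg}"              # the single vg binary used for every step
PREFIX="pangenome_chr14"    # prefix passed to vg autoindex -p

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Rebuild the three Giraffe indexes with the current vg binary.
rebuild_indexes() {
  # Assumption: remove old indexes so none built by another vg version is reused.
  run rm -f "$PREFIX.giraffe.gbz" "$PREFIX.min" "$PREFIX.dist"
  run "$VG" autoindex --workflow giraffe \
    -r cs.PggbChrID.chr14.fasta \
    -v combined.paf.417fcdf.42e55e5.smooth.final.cs.vcf \
    -p "$PREFIX"
}

# Map the paired-end reads; BAM goes to stdout, so the caller redirects it.
map_reads() {
  run "$VG" giraffe -t 60 \
    -Z "$PREFIX.giraffe.gbz" -m "$PREFIX.min" -d "$PREFIX.dist" \
    -f T1_1.fastp.fq.gz -f T1_2.fastp.fq.gz \
    -o BAM
}
```

After sourcing these definitions, calling `rebuild_indexes` and then `map_reads > T1.mapped.bam` runs the same two commands as in the thread. Setting `DRY_RUN=1` only prints the commands, which makes it easy to confirm which binary and which index files would be used.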

Lucio-Yang commented 11 months ago

Thanks for your reply! I used the same version of vg (version=1.38) to run vg autoindex and vg giraffe again. But I got another error:

error:[vg::get_sequence_dictionary] No non-alt-allele paths available in the graph!

Command: vg autoindex --workflow giraffe -r cs.PggbChrID.chr14.fasta -v combined.paf.417fcdf.42e55e5.smooth.final.cs.vcf -p pangenome_chr14

vg giraffe -t 60 -Z pangenome_chr14.giraffe.gbz -m pangenome_chr14.min -d pangenome_chr14.dist -f T1_1.fastp.fq.gz -f T1_2.fastp.fq.gz -o BAM > T1.mapped.bam

jeizenga commented 10 months ago

My recommendation would be to update to the most recent vg version and try both steps again. The versions you are using are nearly 2 years old, and there have been numerous bugfixes since then.
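Before retrying, it may help to confirm which vg version is actually on the PATH. The sketch below parses the first line of `vg version` output, assuming it has the form shown earlier in this thread (`vg version v1.51.0 "Quellenhof"`). The function names are hypothetical, the comparison relies on `sort -V` (a GNU coreutils feature), and the minimum version to require is left to the reader because the thread does not name one.

```shell
# Sketch only: helper names are hypothetical, not part of vg.

# Extract "1.51.0" from a first line such as: vg version v1.51.0 "Quellenhof"
vg_version_number() {
  sed -n '1s/^vg version v\([0-9][0-9.]*\).*/\1/p'
}

# Succeed if version $1 is at least version $2 (uses GNU `sort -V`).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A typical use would be `installed="$(vg version | vg_version_number)"` followed by `version_at_least "$installed" "$minimum"`, run once before `vg autoindex` and once before `vg giraffe` to confirm both steps see the same binary.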