vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

Problem with vg autoindex with phased VCF #4219

Closed by Mirkocoggi 8 months ago

Mirkocoggi commented 8 months ago

Hi, I'm trying to use vg autoindex with human chromosome 17.

I'm using the vg 1.54.0 release "Parafada", running the command:

vg autoindex --workflow giraffe -r Homo_sapiens.GRCh37.dna.chromosome.17.fa -v ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz -p x -V 2

The error I get is:

cannot setRegion on a non-tabix indexed file

I think this happens inside the function HaplotypeIndexer::parse_vcf. I have two questions:

  1. Is there a way/workflow to autoindex these files? For example, how do I obtain the tabix file for these inputs?
  2. Is it possible to create the GBWT without going through an XG graph, using only the FASTA and the VCF?

Mirkocoggi commented 8 months ago

I tried passing the .tbi file on the command line without a flag, but I still got:

cannot setRegion on a non-tabix indexed file

I also tried passing the .tbi file after a second -v flag, but I got this error:

[vg autoindex] Executing command: vg autoindex --workflow giraffe -r Homo_sapiens.GRCh38.dna.chromosome.17.fa -v ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf -v ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz.tbi -p x -V 2
[IndexRegistry]: Checking for phasing in VCF(s).
[IndexRegistry]: Provided: VCF w/ Phasing
[IndexRegistry]: Provided: VCF w/ Phasing
[IndexRegistry]: Chunking inputs for parallelism.
[E::hts_hopen] Failed to open file ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz.tbi
[E::hts_open_format] Failed to open file "ALL.chr17.phase3_shapeit2_mvncall_integrated_v3plus_nounphased.rsID.genotypes.vcf.gz.tbi" : Exec format error
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.54.0 "Parafada"
Stack trace (most recent call last) in thread 1649822:

5 Object "", at 0xffffffffffffffff, in

4 Object "/usr/lib/x86_64-linux-gnu/libc-2.31.so", at 0x7fbdafd8e352, in __clone

  Source "../sysdeps/unix/sysv/linux/x86_64/clone.S", line 95, in __clone [0x7fbdafd8e352]

3 Object "/usr/lib/x86_64-linux-gnu/libpthread-2.31.so", at 0x7fbdb04ea608, in start_thread

  Source "/build/glibc-wuryBv/glibc-2.31/nptl/pthread_create.c", line 477, in start_thread [0x7fbdb04ea608]

2 Object "/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0", at 0x7fbdafe9886d, in

1 Object "/home/users/mirko.coggi/vg/bin/vg", at 0x55652eb5fd26, in vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const, std::allocator<vg::IndexFile const> > const&, vg::IndexingPlan const, vg::AliasGraph&, std::set<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::less<std::cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > const&, bool, bool)#4}::operator()(std::vector<vg::IndexFile const, std::allocator<vg::IndexFile const> > const&, vg::IndexingPlan const, vg::AliasGraph&, std::set<std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::less<std::cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > const&, bool, bool) const [clone ._omp_fn.0]

  Source "src/index_registry.cpp", line 654, in _ZZN2vg9VGIndexes21get_vg_index_registryEvENKUlRKSt6vectorIPKNS_9IndexFileESaIS4_EEPKNS_12IndexingPlanERNS_10AliasGraphERKSt3setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4lessISK_ESaISK_EEbbE2_clES8_SB_SD_SQ_bb._omp_fn.0 [0x55652eb5fd26]

0 Object "/home/users/mirko.coggi/vg/bin/vg", at 0x55652f2ec0a8, in bcf_hdr_read

ERROR: Signal 11 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug. Please include this entire error log in your bug report!

jeizenga commented 8 months ago

This has been resolved on Biostars: https://www.biostars.org/p/9585982/#9586003
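
For anyone hitting the same "cannot setRegion on a non-tabix indexed file" message: it usually means the VCF is not bgzip-compressed, or no .tbi index sits next to it. The index is not passed as an argument; tools look for a file named after the VCF with .tbi appended. Below is a minimal sketch of the usual preparation step, assuming htslib's bgzip and tabix are on PATH. The function names vcf_is_indexed and prepare_vcf are hypothetical helpers, not part of vg, and this is not a restatement of the linked Biostars answer.

```shell
# Hypothetical helper: succeed only if the path ends in .vcf.gz and a
# matching .tbi index file exists next to it.
vcf_is_indexed() {
    case "$1" in
        *.vcf.gz) [ -f "$1.tbi" ] ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: compress and index a VCF if needed, then print the
# path of the file that should be given to -v.
# Assumes bgzip and tabix (from htslib) are installed.
prepare_vcf() {
    vcf="$1"
    if vcf_is_indexed "$vcf"; then
        echo "$vcf"
        return 0
    fi
    case "$vcf" in
        *.vcf.gz)
            # Fails if the file was compressed with plain gzip instead of
            # bgzip; in that case decompress it and run bgzip on the result.
            tabix -p vcf "$vcf" && echo "$vcf"
            ;;
        *.vcf)
            # Write a bgzip-compressed copy, then index the copy.
            bgzip -c "$vcf" > "$vcf.gz" && tabix -p vcf "$vcf.gz" && echo "$vcf.gz"
            ;;
        *)
            echo "unrecognized VCF name: $vcf" >&2
            return 1
            ;;
    esac
}
```

With these definitions, `prepare_vcf my.vcf` would leave my.vcf.gz and my.vcf.gz.tbi side by side, and only the .vcf.gz path would then be given to `vg autoindex -v`.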