vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

ERROR: Signal 6 has occurred. VG has crashed. #3606

Closed sahasra-shankar closed 2 years ago

sahasra-shankar commented 2 years ago

1. What were you trying to do? Run vg construct on the reference and the variants.

2. What did you want to happen? Successfully build the graph.vg

3. What actually happened?

<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
vg: src/constructor.cpp:2213: void vg::Constructor::construct_graph(const std::vector<FastaReference*>&, const std::vector<vcflib::VariantCallFile*>&, const std::vector<FastaReference*>&, const std::function<void(vg::Graph&)>&): Assertion `reference_for.count(fasta_contig)' failed.
ERROR: Signal 6 occurred. VG has crashed.

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

Crash report for vg not-from-git
Stack trace (most recent call last):
#10   Object "/vg/bin/vg", at 0x5a7a2d, in _start
#9    Object "/vg/bin/vg", at 0x1b9a1af, in __libc_start_main
#8    Object "/vg/bin/vg", at 0x57d364, in main
#7    Object "/vg/bin/vg", at 0xbb85cb, in vg::subcommand::Subcommand::operator()(int, char**) const
#6    Object "/vg/bin/vg", at 0xc03cd6, in main_construct(int, char**)
#5    Object "/vg/bin/vg", at 0xd28c01, in vg::Constructor::construct_graph(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::function<void (vg::Graph&)> const&)
#4    Object "/vg/bin/vg", at 0xd27fe1, in vg::Constructor::construct_graph(std::vector<FastaReference*, std::allocator<FastaReference*> > const&, std::vector<vcflib::VariantCallFile*, std::allocator<vcflib::VariantCallFile*> > const&, std::vector<FastaReference*, std::allocator<FastaReference*> > const&, std::function<void (vg::Graph&)> const&)
#3    Object "/vg/bin/vg", at 0x1baa485, in __assert_fail
#2    Object "/vg/bin/vg", at 0x57c763, in __assert_fail_base.cold
#1    Object "/vg/bin/vg", at 0x57c893, in abort
#0    Object "/vg/bin/vg", at 0x12a4eab, in raise

5. What data and command can the vg dev team use to make the problem happen?

vg construct -r ./ssdata/Zaire_Ebola_genomic.fna -v ./ssdata/SNP-2014.vcf.gz > Zaire_Graph.vg

6. What does running vg version say?

<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
vg version not-from-git
Compiled with g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 on Linux
Linked against libstd++ 20200808
Built by root@953ed69611c9

jeizenga commented 2 years ago

It could be that the contig names don't match between the VCF and the FASTA. Can you check to make sure that they do?

sahasra-shankar commented 2 years ago

How would I go about checking that?

jeizenga commented 2 years ago

The VCF contigs should be listed in the header of the file. You can see them with this command:

bcftools view -h variants.vcf.gz | grep contig

The FASTA contigs are the sequence names, which you can pull out with this command:

grep ">" ref.fasta
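
If bcftools or grep is not available, the same two lists can be pulled out with a short script. The following is a minimal Python sketch, not part of vg: the function names are hypothetical, and it assumes that the contig name is the first whitespace-delimited token after ">" in the FASTA and that VCF contigs, if declared at all, appear as "##contig=<ID=...>" header lines.

```python
import gzip
import re


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_contig_names(fasta_path):
    """Return sequence names: the first token after '>' on each header line."""
    names = []
    with open_text(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                names.append(fields[0] if fields else "")
    return names


def vcf_header_contigs(vcf_path):
    """Return the IDs from ##contig header lines (empty if none are declared)."""
    ids = []
    with open_text(vcf_path) as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # the meta-information lines are over
            match = re.match(r"##contig=<.*?\bID=([^,>]+)", line)
            if match:
                ids.append(match.group(1))
    return ids
```

Calling fasta_contig_names("ref.fasta") and vcf_header_contigs("variants.vcf.gz") and comparing the two lists by eye is usually enough to spot a naming mismatch. An empty second list only means the header declares no contigs, which the VCF format allows.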

sahasra-shankar commented 2 years ago

I couldn't check the contigs with the command you suggested because I don't seem to have bcftools. However, I looked at the example VCF file in vg and realized that my VCF does not look exactly like it: mine is missing the list of "##contig=..." lines. Is there a way to edit my current VCF so that it matches the example? Or are there resources for getting a VCF that looks like the example for the virus I am interested in? I have attached my VCF below: SNP-2014.vcf.gz

jeizenga commented 2 years ago

I would strongly recommend installing bcftools if you plan to work with VCF files -- it's really an essential tool in bioinformatics.

That said, if your header doesn't have contig lines (they are optional, so it's not strictly speaking an error), then you could retrieve the contig names with this command (assuming the variants are grouped by contig):

zcat variants.vcf.gz | grep -v "^#" | cut -f 1 | uniq
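
For anyone without zcat or bcftools, here is a rough Python equivalent of that pipeline. It is only a sketch under stated assumptions: the function names are hypothetical, the reference names are supplied by the caller (for example, copied from the ">" lines of the FASTA), and it does not require the variants to be grouped by contig.

```python
import gzip


def vcf_record_contigs(vcf_path):
    """Return the distinct CHROM values of the data lines, in first-seen order."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    seen = {}
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip header lines (starting with '#') and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom = line.rstrip("\n").split("\t", 1)[0]
            seen.setdefault(chrom, None)
    return list(seen)


def missing_from_reference(vcf_contigs, reference_names):
    """Return the VCF contig names that have no identically named FASTA sequence."""
    reference = set(reference_names)
    return [name for name in vcf_contigs if name not in reference]
```

If missing_from_reference(vcf_record_contigs("variants.vcf.gz"), names_from_fasta) returns a non-empty list, the VCF refers to contigs that the FASTA does not contain under the same name. Here names_from_fasta is a placeholder for whatever list of sequence names you collected from the reference.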

sahasra-shankar commented 2 years ago

I see. I fixed the problem, but I was also wondering whether there is a way to find VCF files online that use other virus variants as the reference, as I only found a VCF that uses one specific variant as the reference. Because there are two variants I am interested in, I am hoping to find a VCF that uses the other variants as a reference as well. Are there resources that have VCF files of various viral variants, or would I have to generate my own VCF file? If I have to create my own, is there a tool I could use to do this?

jeizenga commented 2 years ago

I'm afraid I can't be much help there. I'm only particularly familiar with the variant data resources for humans. I'm going to close this issue since the VG error has been resolved.

netwon123 commented 1 year ago

I am also running into this problem. Could you tell me how to solve it? Thanks.