vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

vg pack signal 6 error #4307

Open vetsuisse-unibe opened 3 weeks ago

vetsuisse-unibe commented 3 weeks ago

1. I mapped long reads to the graph gbz.gfa using GraphAligner and got a GAM file. I wanted to do SV calling and genotyping.

I ran the command below:

apptainer exec cactus.sif vg pack -t 64 -x bd2.gbz -g BD016.hifi.gam -o BD016.pack

I also ran apptainer exec cactus.sif vg validate bd2.gbz BD016.hifi.gam, which gave the output:

graph: valid

3. What actually happened? vg pack crashed and I got the following error.

vg: /public/home/anovak/build/vg/include/sdsl/int_vector.hpp:1391: sdsl::int_vector<<anonymous> >::reference sdsl::int_vector<<anonymous> >::operator[](const size_type&) [with unsigned char t_width = 0; sdsl::int_vector<<anonymous> >::reference = sdsl::int_vector_reference<sdsl::int_vector<0> >; sdsl::int_vector<<anonymous> >::size_type = long unsigned int]: Assertion `idx < this->size()' failed.

Crash report for vg v1.56.0 "Collalto"
Stack trace (most recent call last) in thread 1099295:
#14   Object "", at 0xffffffffffffffff, in 
#13   Object "/home/cactus/bin/vg", at 0x21c85ff, in __clone3
#12   Object "/home/cactus/bin/vg", at 0x2121afa, in start_thread
#11   Object "/home/cactus/bin/vg", at 0x20c438d, in gomp_thread_start
#10   Object "/home/cactus/bin/vg", at 0x20c6cd7, in gomp_team_barrier_wait_end
#9    Object "/home/cactus/bin/vg", at 0x20be5da, in gomp_barrier_handle_tasks
#8    Object "/home/cactus/bin/vg", at 0xdcddf5, in void vg::io::for_each_parallel_impl<vg::Alignment>(std::istream&, std::function<void (vg::Alignment&, vg::Alignment&)> const&, std::function<void (vg::Alignment&)> const&, std::function<bool ()> const&, unsigned long) [clone ._omp_fn.1]
#7    Object "/home/cactus/bin/vg", at 0x128753a, in vg::Packer::add(vg::Alignment const&, int, int, int)
#6    Object "/home/cactus/bin/vg", at 0x127ece8, in vg::Packer::increment_coverage(unsigned long)
#5    Object "/home/cactus/bin/vg", at 0x1279421, in sdsl::int_vector<(unsigned char)0>::operator[](unsigned long const&) [clone .isra.0]
#4    Object "/home/cactus/bin/vg", at 0x20f02f5, in __assert_fail
#3    Object "/home/cactus/bin/vg", at 0x5eae73, in __assert_fail_base.cold
#2    Object "/home/cactus/bin/vg", at 0x5eaf4b, in abort
#1    Object "/home/cactus/bin/vg", at 0x20f6905, in raise
#0    Object "/home/cactus/bin/vg", at 0x212345c, in __pthread_kill
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!.

6. What does running vg version say?

vg version v1.56.0 "Collalto"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by anovak@courtyard.gi.ucsc.edu

Thank you very much for your help

glennhickey commented 3 weeks ago

You need to use -a with vg validate:

vg validate bd2.gbz -a BD016.hifi.gam

vetsuisse-unibe commented 3 weeks ago

Thank you very much for your quick answer. When I run validate with -a, I get many "Invalid Alignment" errors:

vg validate bd2.gbz -a BD016.hifi.gam 
Invalid Alignment:
{"identity": 1.0, "mapping_quality": 60, "name": "m84151_230927_104248_s2/258741483/ccs", "path": {"mapping": [{"edit": [{"from_length": 728, "to_length": 728}], "position": {"name": "17147998", "node_id": "17147998", "offset": "641"}}]}, "sequence": "AGAGTGAATTAAAGAATGTCTATTGTGACCTTAGAATATATTTAAGTAAAGAGTAAAATACATTTTCTGGAGGTACTAAGAAGTAGTAATGAATTGTGAAAAGGAAGATTATAAGATTTCTATAAGGAATTCATATATGAAGTCCCATTCTACTTGAGCAAAGAGCATGATAAAAAAATGCAATGTAAATGAGCACCATCAATAACATTAATTACCTGCTGCCTCTGCAGATGTGTATATTAGAGACAACAGGTTAGATGGGTTTTACTAAAGCACAGCAGACCGCTGAATCATATAGCCACAAAGGGCAAACTGTATCATGGAACCAGTTAGAGCTGACTTTATCACTGAGAAAGACATGCTACCAGATTAAAATGCAAATCATACAACTAATGCTGCATTGCTAGACAACACTATGTATAGAATAGGATAGATGCTGTATGTACAGAGAACCAAAATCTCCAGTCCATGCAATGCTCCATGCTTTTTGGAAATATATGCAAAGAATCATATGCTGATGGGTTACATGCTTAAATAACTACTCATTTTCTCATTTATACAACAAGACCCTGTAACTAACAGATTTTCTCAATACTTTGCAAGAATTTTGCTGACTTATTTGCTACGCAAATAAACCATTTTTATTAAATTCAATGCAATTACAAAGAGGTGATATGCTGTATTGATTTAGAAAGATACATCCAAGTTTAGTTAGTAAATACATTTGG"}
Length of node 17147998 (65) exceeded by Mapping with offset 641 and from-length 728:
{"edit": [{"from_length": 728, "to_length": 728}], "position": {"name": "17147998", "node_id": "17147998", "offset": "641"}}
Invalid Alignment:
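
The message above appears to come down to simple arithmetic: the mapping starts at offset 641 and consumes 728 bases, but the node is only 65 bases long. A minimal sketch of that check, assuming the JSON layout shown in the output (the function name `mapping_overruns_node` is hypothetical and is not part of vg):

```python
import json


def mapping_overruns_node(mapping: dict, node_length: int) -> bool:
    """Return True if the mapping extends past the end of its node.

    Illustrative only; this is not vg's own validation code.
    """
    # In the JSON form shown above, the offset is printed as a string.
    offset = int(mapping["position"].get("offset", 0))
    # Bases of the node consumed by all edits in this mapping.
    from_length = sum(int(e.get("from_length", 0)) for e in mapping["edit"])
    return offset + from_length > node_length


# The Mapping reported by vg validate, copied from the output above.
mapping = json.loads(
    '{"edit": [{"from_length": 728, "to_length": 728}], '
    '"position": {"name": "17147998", "node_id": "17147998", "offset": "641"}}'
)

# Node 17147998 is 65 bp long in bd2.gbz, and 641 + 728 = 1369 > 65.
print(mapping_overruns_node(mapping, 65))  # True
```

An offset of 641 on a 65 bp node suggests that the reads were aligned against a graph in which this ID names a much longer sequence.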

I followed the tutorial here https://github.com/ComparativeGenomicsToolkit/cactus/blob/master/doc/sa_refgraph_hackathon_2023.md

and ran the following commands

apptainer exec  cactus.sif \
    cactus-pangenome ./js-bdchr ./seqfile.chr.txt \
    --outName bd2 \
    --outDir bd2 \
    --reference UU_Cfam_GSD_1 \
    --filter 2 \
    --giraffe clip filter \
    --vcf \
    --viz \
    --odgi \
    --chrom-vg clip filter \
    --chrom-og \
    --gbz clip filter full \
    --gfa clip full \
    --vcf \
    --logFile bd.log \
    --mgCores 64 \
    --mapCores 16 \
    --consCores 64 \

apptainer exec vg convert ./bd2/bd2.gbz -f > bd2.gbz.gfa
apptainer exec  graphaligner\:1.0.19--h21ec9f0_0 GraphAligner -g bd2.gbz.gfa -f BD016_flt.fastq -a BD016.hifi.gam -x vg -t 64

I am not sure how to proceed. Thank you very much for any help.

jltsiren commented 3 weeks ago

The GFA you get from vg convert may not be the same graph as the GBZ. That is because GBZ is designed to both preserve the original GFA and expose a graph with long segments chopped to a more manageable size. If you use the GBZ graph, you see the graph with chopped nodes. But if you convert it to GFA, you get the original graph with potentially long segments.

If that is the issue, you should be able to fix it by adding the --no-translation option to the vg convert command.
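
The chopping described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the 1024 bp node limit, the node IDs, and the helper names `chop_segment` and `translate` are all hypothetical and do not reflect the GBZ implementation.

```python
def chop_segment(length, first_node_id, max_node_length=1024):
    """Split one long segment into consecutive nodes.

    Returns a list of (node_id, node_length) pairs. The default limit of
    1024 bp is an assumption made for this illustration.
    """
    nodes = []
    node_id = first_node_id
    remaining = length
    while remaining > 0:
        piece = min(remaining, max_node_length)
        nodes.append((node_id, piece))
        node_id += 1
        remaining -= piece
    return nodes


def translate(nodes, segment_offset):
    """Map an offset on the original segment to (node_id, offset in node)."""
    for node_id, node_length in nodes:
        if segment_offset < node_length:
            return node_id, segment_offset
        segment_offset -= node_length
    raise ValueError("offset is past the end of the segment")


# A hypothetical 2500 bp segment becomes three nodes in the chopped graph.
nodes = chop_segment(2500, first_node_id=1)
print(nodes)                   # [(1, 1024), (2, 1024), (3, 452)]
# Offset 1100 on the segment lands 76 bp into the second node.
print(translate(nodes, 1100))  # (2, 76)
```

An aligner that sees only the unchopped GFA reports positions in segment coordinates, such as offset 1100. Those positions do not fit nodes in the chopped graph, which would be consistent with the errors from vg validate.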