vg call crash with "ERROR: Signal 6 occurred. VG has crashed."

vrenaul1 commented 3 months ago

1. What were you trying to do?

Simple variant calling on a small fastq file

2. What did you want to happen?

Have a vcf file generated

3. What actually happened?

vg call returned an error message as shown below

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

INFO:    gocryptfs not found, will not be able to use gocryptfs
vg [warning]: System's vm.overcommit_memory setting is 2 (never overcommit). vg does not work well under these conditions; you may appear to run out of memory with plenty of memory left. Attempting to unsafely reconfigure jemalloc to deal better with this situation.
terminate called after throwing an instance of 'St9bad_alloc'
  what():  std::bad_alloc
terminate called recursively
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.55.0 "Bernol
terminate called recursively
━━━━━━━━━━━━━━━━━━━━
Crash report for vg ━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.55.0 "Bernolda"
Stack trace (most recent call last) in thread 1990508:
terminate called recursively
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.55.0 "Bernol
Stack trace (most recent call last) in thread 1990527:
v1.55.0 "Bernol
Stack trace (most recent call last) in thread 1990529:
#12   Object "", at 0xffffffffffffffff, in
#11   Object "", at 0x1fcbc72, in
#10   Object "", at 0x14c8c88, in
#9    Object "", at 0x171a0f8, in
#8    Object "", at 0xe4f930, in
#7    Object "", at 0x1bb0e9d, in
#6    Object "", at 0x5c1fd5, in
#5    Object "", at 0x1e4f698, in
#4    Object "", at 0x1e4f536, in
#3    Object "", at 0x1e4f4cb, in
#2    Object "", at 0x5c030a, in
#1    Object "", at 0x5c2ca7, in
#0    Object "", at 0x14cc50b, in
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!
━━━━━━━━━━━━━━━━━━━━

5. What data and command can the vg dev team use to make the problem happen?

The following command was run:

singularity run ~/vg_v1.55.0.sif vg call -r ~/gg/hprc-v1.1-mc-grch38.snarls -s A -S GRCh38#0#chr1 -k ~/NA12878_gg/A.pack /mnt/beegfs/home/vrenaul1/gg/hprc-v1.1-mc-grch38.gbz > /mnt/beegfs/home/vrenaul1/NA12878_gg/A.vcf

The fastq file used is NA12878.chr22.sample.fq.gz extracted from https://zenodo.org/records/10794014/files/data.tar.gz?download=1.

6. What does running vg version say?

INFO:    gocryptfs not found, will not be able to use gocryptfs
vg [warning]: System's vm.overcommit_memory setting is 2 (never overcommit). vg does not work well under these conditions; you may appear to run out of memory with plenty of memory left. Attempting to unsafely reconfigure jemalloc to deal better with this situation.
vg version v1.55.0 "Bernolda"
Compiled with g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 on Linux
Linked against libstd++ 20210601
Built by root@buildkitsandbox

jeizenga commented 2 months ago

@adamnovak would know better, but I'm pretty sure the error is related to the warning on the second line. A bad_alloc error typically indicates that you were out of memory, which is exactly what that warning says might happen. As I understand it, the memory allocator that is built into vg really can't work well if /proc/sys/vm/overcommit_memory is set to 2. I would recommend talking to the admin of your server about changing the vm.overcommit_memory setting.

If that's not possible, you won't be able to use the prebuilt releases, but it's possible to compile vg manually with a different memory allocator. You can follow the standard build instructions, but when it's time to run make, you add an extra argument like this:

make -j [nthreads] jemalloc=off

You should only do this if you really can't change the vm.overcommit_memory setting though. Switching out the memory allocator will slow down vg substantially.
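For reference, the overcommit mode that the warning refers to can be read without root. The sketch below is not part of vg; `describe_overcommit` is a hypothetical helper name, and the mode meanings (0 heuristic, 1 always, 2 never) are the standard Linux kernel ones.

```shell
#!/bin/sh
# Print the kernel's current overcommit mode with a short explanation.
# describe_overcommit is a hypothetical helper, not something vg provides.
overcommit_file=/proc/sys/vm/overcommit_memory

describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit" ;;
        1) echo "always overcommit" ;;
        2) echo "never overcommit (vg warns about this mode)" ;;
        *) echo "unknown" ;;
    esac
}

if [ -r "$overcommit_file" ]; then
    mode=$(cat "$overcommit_file")
    echo "vm.overcommit_memory = $mode: $(describe_overcommit "$mode")"
else
    echo "cannot read $overcommit_file (not a Linux system?)"
fi

# Changing the mode needs root, so on a shared cluster it is a request
# for the administrator, for example:
#   sudo sysctl -w vm.overcommit_memory=0
```

If this reports mode 2 and the administrator cannot change it, the `jemalloc=off` build described above is the fallback.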

adamnovak commented 2 months ago

I think @jeizenga is right here.

We publish Docker images containing a snapshot of the build, which you can use to re-link without jemalloc as part of a container build process.

With Singularity 3.11, I think you should be able to get a usable SIF file with:

cat >vg-nojemalloc.def <<'EOF'
Bootstrap: docker
From: quay.io/vgteam/vg:cache-v1.55.0-build

%post
    cd /vg && make jemalloc=off
EOF
singularity build vg-nojemalloc.sif vg-nojemalloc.def

On older versions of Singularity, you need root access, or the administrator needs to have set up --fakeroot so that you can build containers.

Or, if you can build Dockerfiles, I think you could build with Docker like this:

cat >vg-nojemalloc.dockerfile <<'EOF'
FROM quay.io/vgteam/vg:cache-v1.55.0-build
RUN make jemalloc=off
EOF
docker build . -f vg-nojemalloc.dockerfile -t quay.io/vgteam/vg:v1.55.0-nojemalloc

adamnovak commented 2 months ago

@vrenaul1 I pushed up the container built from this as quay.io/adamnovak/vg:v1.55.0-nojemalloc. Maybe try:

singularity run docker://quay.io/adamnovak/vg:v1.55.0-nojemalloc vg call -r ~/gg/hprc-v1.1-mc-grch38.snarls -s A -S GRCh38#0#chr1 -k ~/NA12878_gg/A.pack /mnt/beegfs/home/vrenaul1/gg/hprc-v1.1-mc-grch38.gbz > /mnt/beegfs/home/vrenaul1/NA12878_gg/A.vcf

vrenaul1 commented 2 months ago

Thanks @adamnovak and @jeizenga for your input! Unfortunately, the value in /proc/sys/vm/overcommit_memory cannot be changed on the compute cluster I am using. I was able, though, to build a Singularity image with jemalloc turned off by following your recommendations, @adamnovak.

I am running the vg call command on the compute cluster with the new Singularity image. In parallel, I am running the same command on a local machine where I can change the value of /proc/sys/vm/overcommit_memory (set to 0). In both cases the running time is abnormally long: it has been running for over 12 hours so far, even though the input file is pretty small. The VCF file is still empty and the log file does not show any progress. I am not sure what is wrong with my command:

singularity run ~/graphGenomesTools_1.0.sif vg call -t 32 -r ~/graphGenomes/hprc-v1.1-mc-grch38.snarls -s A -k ~/NA12878.chr22/A.pack -S GRCh38 ~/graphGenomes/hprc-v1.1-mc-grch38.gbz > ~/NA12878.chr22/A.vcf

Any ideas on how I can improve the running time? Thanks for your help!

glennhickey commented 2 months ago

Usually adding -z to vg call in order to restrict the search to haplotypes in the GBZ will speed it up considerably.

vrenaul1 commented 2 months ago

Thanks @glennhickey for your feedback. My main purpose in using graph mapping and calling tools is to detect delins events that are difficult to detect with conventional NGS tools. If I understand correctly, adding -z does speed things up a lot (the process finished in minutes), but I may miss these delins events if they do not belong to any haplotype in the GBZ. Is that right? I am working on cancer data, so there are probably a lot of such events.

glennhickey commented 2 months ago

If you don't want to restrict to the haplotypes, you can try -C to limit the maximum allele size. Note: this option got fixed up in the latest vg release (v1.56.0), so make sure to be using that one.
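As a rough illustration of the two options discussed here, the sketch below only prints the two command variants so they can be compared before running anything. The paths are the ones from the report above; `vg_call_cmd` is a hypothetical helper and `MAX_LEN=10000` is an arbitrary placeholder, not a recommended value. Check `vg call --help` in your release for the exact meaning and units of -C.

```shell
#!/bin/sh
# Dry run: print two variants of the vg call command from this thread.
# Nothing is executed; vg_call_cmd is a hypothetical helper for illustration.
GRAPH="$HOME/graphGenomes/hprc-v1.1-mc-grch38.gbz"
SNARLS="$HOME/graphGenomes/hprc-v1.1-mc-grch38.snarls"
PACK="$HOME/NA12878.chr22/A.pack"
MAX_LEN=10000   # assumption: placeholder cap, tune for your data

vg_call_cmd() {
    # $1 is the extra option string, e.g. "-z" or "-C <N>"
    echo "vg call -t 32 -r $SNARLS -s A -k $PACK -S GRCh38 $1 $GRAPH"
}

# Variant 1: restrict calling to haplotypes in the GBZ (fast, may miss novel alleles)
echo "haplotype-restricted: $(vg_call_cmd "-z")"

# Variant 2: keep novel alleles but cap the allele size
echo "length-capped: $(vg_call_cmd "-C $MAX_LEN")"
```

To run one of the variants, prefix the printed command with the `singularity run <image>` call and redirect standard output to the VCF file, as in the commands earlier in the thread.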

vrenaul1 commented 2 months ago

Thanks @glennhickey for the information! It does indeed speed up the analysis significantly.

vgteam / vg

vg call crash with "ERROR: Signal 6 occurred. VG has crashed." #4260