BilkentCompGen / valor

variation discovery using long range information in linked-reads
BSD 3-Clause "New" or "Revised" License

Segmentation faults when run in docker container #7

Closed (cwhelan closed 6 years ago)

cwhelan commented 6 years ago

Hi,

I'm trying to run Valor for inversion detection. I was able to complete a test run successfully on a sample BAM file on our local compute cluster. However, when I try to run in a docker container on a cloud VM using the same input file, I keep hitting segmentation faults. Here's the full output from one of my attempts:

Loading SONIC file..

SONIC Info: GRCh37 - 1000 Genomes Version
Built in Wed Apr 11 20:40:54 2018

Number of chromosomes: 84
Loading gap intervals.
Loading duplication intervals.
Loading repeats.
*** Error in `valor': corrupted size vs. prev_size: 0x000000008afa66c0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f4fdc41f7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7e913)[0x7f4fdc426913]
/lib/x86_64-linux-gnu/libc.so.6(+0x81cde)[0x7f4fdc429cde]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f4fdc42c184]
/lib/x86_64-linux-gnu/libc.so.6(qsort_r+0x79)[0x7f4fdc3e1349]
valor[0x403ad6]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f4fdc3c8830]
valor[0x404149]
======= Memory map: ========
00400000-0047d000 r-xp 00000000 08:01 550780                             /valor/valor
0067c000-0067d000 r--p 0007c000 08:01 550780                             /valor/valor
0067d000-0067e000 rw-p 0007d000 08:01 550780                             /valor/valor
00730000-94a5e000 rw-p 00000000 00:00 0                                  [heap]
7f4fc37d1000-7f4fc37d2000 ---p 00000000 00:00 0 
7f4fc37d2000-7f4fc3fd2000 rw-p 00000000 00:00 0 
7f4fc3fd2000-7f4fc3fd3000 ---p 00000000 00:00 0 
7f4fc3fd3000-7f4fc47d3000 rw-p 00000000 00:00 0 
7f4fc47d3000-7f4fc47d4000 ---p 00000000 00:00 0 
7f4fc47d4000-7f4fc4fd4000 rw-p 00000000 00:00 0 
7f4fcc000000-7f4fcc021000 rw-p 00000000 00:00 0 
7f4fcc021000-7f4fd0000000 ---p 00000000 00:00 0 
7f4fd2d7d000-7f4fd2d7e000 ---p 00000000 00:00 0 
7f4fd2d7e000-7f4fd357e000 rw-p 00000000 00:00 0 
7f4fd357e000-7f4fd357f000 ---p 00000000 00:00 0 
7f4fd357f000-7f4fd3d7f000 rw-p 00000000 00:00 0 
7f4fd3d7f000-7f4fd3d80000 ---p 00000000 00:00 0 
7f4fd3d80000-7f4fd4580000 rw-p 00000000 00:00 0 
7f4fd4580000-7f4fd4581000 ---p 00000000 00:00 0 
7f4fd4581000-7f4fd4d81000 rw-p 00000000 00:00 0 
7f4fd5401000-7f4fd5417000 r-xp 00000000 08:01 534294                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7f4fd5417000-7f4fd5616000 ---p 00016000 08:01 534294                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7f4fd5616000-7f4fd5617000 rw-p 00015000 08:01 534294                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7f4fd5617000-7f4fdc1a4000 rw-p 00000000 00:00 0 
7f4fdc1a4000-7f4fdc1a7000 r-xp 00000000 08:01 534286                     /lib/x86_64-linux-gnu/libdl-2.23.so
7f4fdc1a7000-7f4fdc3a6000 ---p 00003000 08:01 534286                     /lib/x86_64-linux-gnu/libdl-2.23.so
7f4fdc3a6000-7f4fdc3a7000 r--p 00002000 08:01 534286                     /lib/x86_64-linux-gnu/libdl-2.23.so
7f4fdc3a7000-7f4fdc3a8000 rw-p 00003000 08:01 534286                     /lib/x86_64-linux-gnu/libdl-2.23.so
7f4fdc3a8000-7f4fdc568000 r-xp 00000000 08:01 534273                     /lib/x86_64-linux-gnu/libc-2.23.so
7f4fdc568000-7f4fdc768000 ---p 001c0000 08:01 534273                     /lib/x86_64-linux-gnu/libc-2.23.so
7f4fdc768000-7f4fdc76c000 r--p 001c0000 08:01 534273                     /lib/x86_64-linux-gnu/libc-2.23.so
7f4fdc76c000-7f4fdc76e000 rw-p 001c4000 08:01 534273                     /lib/x86_64-linux-gnu/libc-2.23.so
7f4fdc76e000-7f4fdc772000 rw-p 00000000 00:00 0 
7f4fdc772000-7f4fdc793000 r-xp 00000000 08:01 11668                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f4fdc793000-7f4fdc992000 ---p 00021000 08:01 11668                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f4fdc992000-7f4fdc993000 r--p 00020000 08:01 11668                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f4fdc993000-7f4fdc994000 rw-p 00021000 08:01 11668                      /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7f4fdc994000-7f4fdc9ac000 r-xp 00000000 08:01 534341                     /lib/x86_64-linux-gnu/libpthread-2.23.so
7f4fdc9ac000-7f4fdcbab000 ---p 00018000 08:01 534341                     /lib/x86_64-linux-gnu/libpthread-2.23.so
7f4fdcbab000-7f4fdcbac000 r--p 00017000 08:01 534341                     /lib/x86_64-linux-gnu/libpthread-2.23.so
7f4fdcbac000-7f4fdcbad000 rw-p 00018000 08:01 534341                     /lib/x86_64-linux-gnu/libpthread-2.23.so
7f4fdcbad000-7f4fdcbb1000 rw-p 00000000 00:00 0 
7f4fdcbb1000-7f4fdccb9000 r-xp 00000000 08:01 534305                     /lib/x86_64-linux-gnu/libm-2.23.so
7f4fdccb9000-7f4fdceb8000 ---p 00108000 08:01 534305                     /lib/x86_64-linux-gnu/libm-2.23.so
7f4fdceb8000-7f4fdceb9000 r--p 00107000 08:01 534305                     /lib/x86_64-linux-gnu/libm-2.23.so
7f4fdceb9000-7f4fdceba000 rw-p 00108000 08:01 534305                     /lib/x86_64-linux-gnu/libm-2.23.so
7f4fdceba000-7f4fdced3000 r-xp 00000000 08:01 534372                     /lib/x86_64-linux-gnu/libz.so.1.2.8
7f4fdced3000-7f4fdd0d2000 ---p 00019000 08:01 534372                     /lib/x86_64-linux-gnu/libz.so.1.2.8
7f4fdd0d2000-7f4fdd0d3000 r--p 00018000 08:01 534372                     /lib/x86_64-linux-gnu/libz.so.1.2.8
7f4fdd0d3000-7f4fdd0d4000 rw-p 00019000 08:01 534372                     /lib/x86_64-linux-gnu/libz.so.1.2.8
7f4fdd0d4000-7f4fdd0fa000 r-xp 00000000 08:01 534253                     /lib/x86_64-linux-gnu/ld-2.23.so
7f4fdd10b000-7f4fdd2f5000 rw-p 00000000 00:00 0 
7f4fdd2f8000-7f4fdd2f9000 rw-p 00000000 00:00 0 
7f4fdd2f9000-7f4fdd2fa000 r--p 00025000 08:01 534253                     /lib/x86_64-linux-gnu/ld-2.23.so
7f4fdd2fa000-7f4fdd2fb000 rw-p 00026000 08:01 534253                     /lib/x86_64-linux-gnu/ld-2.23.so
7f4fdd2fb000-7f4fdd2fc000 rw-p 00000000 00:00 0 
7ffc787bd000-7ffc787de000 rw-p 00000000 00:00 0                          [stack]
7ffc787e2000-7ffc787e4000 r--p 00000000 00:00 0                          [vvar]
7ffc787e4000-7ffc787e6000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
/cromwell_root/script: line 23:    16 Aborted                 (core dumped) valor -i /cromwell_root/broad-gpc-wgs-10x-pilot/10x-longranger-output/10C110569/outs/phased_possorted_bam.bam -s /cromwell_root/cw-methods-dev/reference/sonic/human_g1k_v37.sonic -o 10C110569 -f INV -t 8
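
One way to narrow this down is to symbolize the frames that fall inside the `valor` binary itself. Reading the backtrace above, the abort is raised inside `malloc`, which was called from `qsort_r`, which was called from `valor` at `0x403ad6`. glibc usually reports heap corruption at the next allocation, so the faulty write may well have happened earlier. The memory map shows `/valor/valor` mapped at `00400000`, so those addresses can be passed to `addr2line` unchanged. The sketch below is only an illustration: `parse_backtrace` and `addresses_in` are hypothetical helper names, not part of Valor, and the frame format is assumed to match the glibc output pasted above.

```python
import re

# Matches glibc backtrace frames such as "valor[0x403ad6]" or
# "/lib/x86_64-linux-gnu/libc.so.6(qsort_r+0x79)[0x7f4fdc3e1349]".
FRAME_RE = re.compile(
    r"^(?P<module>[^\s(\[]+)"          # path or name of the binary/library
    r"(?:\((?P<symbol>[^)]*)\))?"      # optional "(symbol+offset)"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"   # absolute address in brackets
)


def parse_backtrace(text):
    """Return (module, symbol, address) tuples for every frame line in text."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(
                (match.group("module"), match.group("symbol") or "", match.group("addr"))
            )
    return frames


def addresses_in(frames, binary_name):
    """Keep only the addresses whose module file name equals binary_name."""
    return [
        addr
        for module, _symbol, addr in frames
        if module.rsplit("/", 1)[-1] == binary_name
    ]


# Frames copied from the backtrace in this report (a few libc frames omitted).
BACKTRACE = """\
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f4fdc41f7e5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f4fdc42c184]
/lib/x86_64-linux-gnu/libc.so.6(qsort_r+0x79)[0x7f4fdc3e1349]
valor[0x403ad6]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f4fdc3c8830]
valor[0x404149]
"""

frames = parse_backtrace(BACKTRACE)
valor_addrs = addresses_in(frames, "valor")

# Print a command line that can be run inside the container.
print("addr2line -f -e valor " + " ".join(valor_addrs))
# prints "addr2line -f -e valor 0x403ad6 0x404149"
```

`addr2line -f -e` is the standard binutils invocation for mapping addresses to function names and source lines. It only gives useful output if the binary carries debug information, so this assumes the image is rebuilt with `-g` added to the compiler flags.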

I had to slightly modify the distributed docker build to make it work with our WDL/cromwell execution environment, and so that I could tag the image with a version number and avoid docker caching issues. Here's a copy of my Dockerfile:

FROM ubuntu:16.04

RUN apt-get update -y --fix-missing
RUN apt-get install git make gcc zlib1g-dev -y

RUN mkdir /valor
WORKDIR /valor

RUN git clone https://github.com/BilkentCompGen/valor.git /valor && git checkout 89a664d1462d56f84238cbb7fa7141075ad59555 && git fetch --recurse-submodules
RUN make libs && make

ENV PATH="/valor:${PATH}"

In case it helps, here's some abridged output from running on the same BAM file on our local compute cluster:

valor/valor -i /seq/10x/longranger_data/2.1.0/inpsyght/10C110569/outs/phased_possorted_bam.bam -s human_g1k_v37.sonic -o 10C110569 -f INV -t 8

VALOR: Variation with LOng Range
Version: 2.0
Build Date: Fri Sep 21 13:36:07 EDT 2018
Output Directory: 10C110569
Logfile name: valor.log
Loading SONIC file..

SONIC Info: GRCh37 - 1000 Genomes Version
Built in Wed Apr 11 20:40:54 2018

Number of chromosomes: 84
Loading gap intervals.
Loading duplication intervals.
Loading repeats.
SONIC file loaded. Memory usage: 392.99 MB.
Reading Bam file: /seq/10x/longranger_data/2.1.0/inpsyght/10C110569/outs/phased_possorted_bam.bam

Finding Structural Variants in Chromosome 1
Recovering Split Molecules..
Sorting the DNA Intervals
Done
Recovering Molecules
Done
Global Molecule depth mean: 205.584818
Global Molecule Depth Standard Deviation: 86.213980
Molecule Count: 1292843 Molecule Mean: 18999.564059     Molecule std-dev: 2263.166451
Matching Split Molecules
8086 candidate variations are made
671 candidate variations are left after filtering
Finding Sv Clusters

Clustering is finished, found 1 variant clusters
Printing Variant calls

Finding Structural Variants in Chromosome 2
Recovering Split Molecules..
Sorting the DNA Intervals
Done
Recovering Molecules
Done
Global Molecule depth mean: 221.191628
Global Molecule Depth Standard Deviation: 50.647846
Molecule Count: 1380062 Molecule Mean: 18994.467117     Molecule std-dev: 1865.681670
Matching Split Molecules

4341 candidate variations are made
17 candidate variations are left after filtering
Finding Sv Clusters

Clustering is finished, found 1 variant clusters
Printing Variant calls

....

Finding Structural Variants in Chromosome 22
Recovering Split Molecules..
Sorting the DNA Intervals
Done
Recovering Molecules
Done
Global Molecule depth mean: 142.279142
Global Molecule Depth Standard Deviation: 71.139571
Molecule Count: 55424   Molecule Mean: 18642.012431     Molecule std-dev: 3709.821734
Matching Split Molecules
2667 candidate variations are made
12 candidate variations are left after filtering
Finding Sv Clusters

Clustering is finished, found 0 variant clusters
Printing Variant calls

Finding Structural Variants in Chromosome X
Recovering Split Molecules..
Sorting the DNA Intervals
Done
Recovering Molecules
Done
Global Molecule depth mean: 112.841685
Global Molecule Depth Standard Deviation: 56.420842
Molecule Count: 226287  Molecule Mean: 19483.331592     Molecule std-dev: 3883.561550
Matching Split Molecules
163 candidate variations are made
8 candidate variations are left after filtering
Finding Sv Clusters

Clustering is finished, found 0 variant clusters
Printing Variant calls

Finding Structural Variants in Chromosome Y
Recovering Split Molecules..
Sorting the DNA Intervals
Done
Recovering Molecules
Done
Global Molecule depth mean: 29.738544
Global Molecule Depth Standard Deviation: 14.869272
Molecule Count: 8232    Molecule Mean: 21464.246720     Molecule std-dev: 8799.662295
Matching Split Molecules
533 candidate variations are made
0 candidate variations are left after filtering
Finding Sv Clusters

Any thoughts on what might be going wrong or how to debug? Thanks!

calkan commented 6 years ago

Thanks Chris. We are aware of some docker problems and are currently working on them. We are also trying to install it on the SevenBridges platform to validate its use on cloud VMs. We'll get it working soon, hopefully.

calkan commented 6 years ago

Hi @cwhelan, can you try again with the latest source code? I have it running in my docker environment now, but haven't tested it on AWS yet. We also have a prebuilt image: https://hub.docker.com/r/alkanlab/valor/

cwhelan commented 6 years ago

Hi @calkan, I was able to update to the latest commit (51dafc5a194d899d79ef8d3afd6bbe20173b8d06) and things seem to be running successfully now. Thanks!