arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License

lumpyexpress segmentation fault on b37 aligned tumor-normal pairs #259

Closed fpbarthel closed 6 years ago

fpbarthel commented 6 years ago

Hi,

I am running into previously reported segmentation fault errors using the lumpy-sv 0.2.14a install from Anaconda/bioconda.

Interestingly, some of the samples complete successfully and without errors. Re-running the samples that initially failed reproduces the segmentation fault on every run, even after several attempts.

The exact error is .../miniconda3/envs/lumpy-sv/bin/lumpyexpress: line 548: 10330 segmentation fault (core dumped)

There is plenty of space (>30 TB) in the temporary directory provided using -T and sufficient memory (72 GB) available.

Here is the output of ldd lumpy. There is no bamtools link, which was previously reported to be an issue here.

barthf@helix112$ ldd `which lumpy`
        linux-vdso.so.1 =>  (0x00007fffdedfa000)
        libz.so.1 => /projects/barthf/opt/miniconda3/envs/lumpy-sv/bin/../lib/libz.so.1 (0x00007f62d6dec000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003ad2c00000)  
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003ad3800000)
        libstdc++.so.6 => /projects/barthf/opt/miniconda3/envs/lumpy-sv/bin/../lib/libstdc++.so.6 (0x00007f62d6a98000)
        libm.so.6 => /lib64/libm.so.6 (0x00000033eb000000)
        libgcc_s.so.1 => /projects/barthf/opt/miniconda3/envs/lumpy-sv/bin/../lib/libgcc_s.so.1 (0x00007f62d6885000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003ad3000000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003ad2800000)

The input data is b37 aligned, and several extra contigs are included, as was proposed to be an issue here. Is there a b37 specific mask available that I can use to mask these contigs?

However, it is still surprising that some samples completed without issues despite having these contigs.

Any idea what the problem could be?

...
X       1000000
Y       1000000
Y       2000000
Y       4000000
MT      1000000
GL000207.1      1000000
GL000226.1      1000000
GL000229.1      1000000
GL000231.1      1000000
GL000210.1      1000000
GL000239.1      1000000
GL000235.1      1000000
GL000201.1      1000000
GL000247.1      1000000
GL000245.1      1000000
GL000197.1      1000000
GL000203.1      1000000
GL000246.1      1000000
GL000249.1      1000000
GL000196.1      1000000
GL000248.1      1000000
GL000244.1      1000000
GL000238.1      1000000
GL000202.1      1000000
GL000234.1      1000000
GL000232.1      1000000
GL000206.1      1000000
GL000240.1      1000000
GL000236.1      1000000
GL000241.1      1000000
GL000243.1      1000000
GL000242.1      1000000
GL000230.1      1000000
GL000237.1      1000000
GL000233.1      1000000
GL000204.1      1000000
GL000198.1      1000000
GL000208.1      1000000
GL000191.1      1000000
GL000227.1      1000000
GL000228.1      1000000
GL000214.1      1000000
GL000221.1      1000000
GL000209.1      1000000
GL000218.1      1000000
GL000220.1      1000000
GL000213.1      1000000
GL000211.1      1000000
GL000199.1      1000000
GL000217.1      1000000
GL000216.1      1000000
GL000215.1      1000000
GL000205.1      1000000
GL000219.1      1000000
GL000224.1      1000000
GL000223.1      1000000
GL000195.1      1000000
GL000212.1      1000000
GL000222.1      1000000
GL000200.1      1000000
GL000193.1      1000000
GL000194.1      1000000
GL000225.1      1000000
GL000192.1      1000000
NC_007605       1000000
hs37d5  1000000

fpbarthel commented 6 years ago

Using a b37-specific mask that excludes all the extra contigs worked. I'm leaving this open a little longer because I'm curious why this prevents the segmentation fault. Is there any way to make the error message more descriptive? And what if I were actually interested in the extra contigs?
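
For anyone who needs to build such a mask, here is a minimal sketch in Python, not the exact file used above. It assumes the reference's .fai index is available as text, and that "extra contigs" means everything other than 1-22, X, Y and MT. The function name `extra_contig_bed`, the `PRIMARY` set, and the example index rows (including their lengths and offsets) are hypothetical and only for illustration.

```python
# Contigs treated as "primary" for b37. Keeping MT here is an assumption;
# remove it from the set if you want MT masked as well.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}


def extra_contig_bed(fai_text, primary=PRIMARY):
    """Return BED lines (name, 0, length) covering every non-primary contig.

    fai_text is the content of a FASTA index (.fai) file, whose first two
    tab-separated columns are the contig name and the contig length.
    """
    bed_lines = []
    for row in fai_text.splitlines():
        if not row.strip():
            continue  # skip blank lines
        name, length = row.split("\t")[:2]
        if name not in primary:
            # BED is 0-based, half-open, so 0..length spans the whole contig.
            bed_lines.append(f"{name}\t0\t{int(length)}")
    return bed_lines


# Illustrative .fai rows (values are placeholders, not copied from a real index).
example_fai = (
    "1\t249250621\t52\t60\t61\n"
    "MT\t16569\t3153506529\t70\t71\n"
    "GL000207.1\t4262\t3153523358\t60\t61\n"
    "hs37d5\t35477943\t3153527700\t60\t61\n"
)

mask = extra_contig_bed(example_fai)
print("\n".join(mask))
```

The resulting lines can be written to a file and supplied as the exclude BED (I believe this is the -x option of lumpyexpress, but check the usage text of your installed version).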

ryanlayer commented 6 years ago

You are right. Lumpy needs better error messages.

I am guessing that it was a memory allocation error, which the exclude file prevents by skipping huge pileups of alignments. Since these regions are highly repetitive, looking in those regions may require a different strategy that would give you all alignments, not just one.
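
One way to check whether a failing sample really has such pileups is to compare read density per contig. The following is only a rough diagnostic sketch, not part of lumpy. It assumes you have saved the text output of samtools idxstats (tab-separated: contig name, length, mapped reads, unmapped reads). The function name `dense_contigs`, the threshold, and the example counts are hypothetical.

```python
def dense_contigs(idxstats_text, min_reads_per_kb=1000.0):
    """Return (contig, reads_per_kb) pairs at or above the density threshold.

    idxstats_text is the text output of `samtools idxstats`; the final "*"
    row (unplaced reads) and zero-length entries are skipped.
    """
    hits = []
    for row in idxstats_text.splitlines():
        fields = row.split("\t")
        if len(fields) < 4 or fields[0] == "*":
            continue
        name, length, mapped = fields[0], int(fields[1]), int(fields[2])
        if length == 0:
            continue
        density = mapped / (length / 1000.0)
        if density >= min_reads_per_kb:
            hits.append((name, density))
    # Densest contigs first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Made-up counts for illustration only.
example_idxstats = (
    "1\t249250621\t50000000\t0\n"
    "GL000207.1\t4262\t90000\t0\n"
    "*\t0\t0\t1200\n"
)

flagged = dense_contigs(example_idxstats)
for name, density in flagged:
    print(f"{name}\t{density:.1f} reads/kb")
```

A contig that shows up here in the failing samples but not in the successful ones would be consistent with the pileup explanation, though it would not prove it.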

fpbarthel commented 6 years ago

Thanks! I'll close this out