PacificBiosciences / FALCON-integrate

Mostly deprecated. See https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries
https://github.com/PacificBiosciences/FALCON/wiki/Manual

LA4Falcon segmentation fault #143

Open AGI-chandler opened 7 years ago

AGI-chandler commented 7 years ago

We are experiencing a seg fault with LA4Falcon on our CentOS 6.9 server, using Falcon from pitchfork.

The fault is happening in the 0-rawreads/preads/cns_00170 task:

$ cat c_00170.sh 
#!/bin/bash
set -vex
set -o pipefail
CUTOFF=$(python2.7 -m falcon_kit.mains.calc_cutoff --coverage 40.0 800000000 <(DBstats -b1 /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/raw_reads.db))
LA4Falcon -H$CUTOFF -fo  /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/raw_reads.db /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las | fc_consensus --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 1 >| cns_00170.fasta
touch out.done$ 

A core.17702 dump was produced and here is what gdb tells me:

$ gdb /opt/pacbio/pitchfork/deployment/bin/LA4Falcon -c 0-rawreads/preads/cns_00170/core.17702 
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6)
This GDB was configured as "x86_64-redhat-linux-gnu".
Reading symbols from /opt/pacbio/pitchfork/deployment/bin/LA4Falcon...(no debugging symbols found)...done.
[New Thread 17702]
Reading symbols from /lib64/libm.so.6...Reading symbols from /usr/lib/debug/lib64/libm-2.12.so.debug...done.
done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libpthread.so.0...Reading symbols from /usr/lib/debug/lib64/libpthread-2.12.so.debug...done.
[Thread debugging using libthread_db enabled]
done.
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libc.so.6...Reading symbols from /usr/lib/debug/lib64/libc-2.12.so.debug...done.
done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib64/ld-2.12.so.debug...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `LA4Falcon -H24386 -fo /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000401faa in main ()
(gdb) bt full
#0  0x0000000000401faa in main ()
No symbol table info available.
(gdb) q
$

I notice the line "Core was generated by 'LA4Falcon -H24386 -fo /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads'.", which isn't the full command (missing .las file as last argument).

The cutoff value produced here is 24386. So I tried running the command manually (with more cores so it goes faster; the same thing happens with 2, 4, 8, or 16 cores):

$ LA4Falcon -H24386 -fo /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/raw_reads.db /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las | fc_consensus --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 40 >| cns_00170.fasta

It runs for a little while, then crashes again.

LA4Falcon, python2.7, and fc_consensus are all from /opt/pacbio/pitchfork/deployment/bin

Jianwei-Zhang commented 7 years ago

In addition to the above: only 2 out of 218 LA4Falcon jobs failed with "segmentation fault".
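To see which of the consensus tasks did not finish, one option is to look for task directories that lack the sentinel file written at the end of each script. The following is only a sketch: the helper name list_unfinished is hypothetical, and the "cns_*" directory pattern and the "out.done" sentinel name are assumptions taken from the task path and script shown above.

```shell
# Hypothetical helper: print the names of task directories under a
# preads directory that do not contain a sentinel file.
# Assumptions: tasks are named cns_*, and a finished task has out.done
# (the script above only reaches "touch out.done" if the pipeline succeeds).
list_unfinished() {
    preads_dir=$1
    for task in "$preads_dir"/cns_*/; do
        [ -d "$task" ] || continue              # glob matched nothing
        [ -e "${task}out.done" ] || basename "$task"
    done
}

# Example with the path from this thread:
# list_unfinished /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/preads
```

Because the script runs with set -e and pipefail, a crash in LA4Falcon stops it before the touch, so a missing sentinel should point at a failed or still-running task.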

pb-cdunn commented 7 years ago

which isn't the full command (missing .las file as last argument)

That's nothing. It's truncated.
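A quick way to convince yourself of the truncation: as far as I know, a Linux core file stores the command line in a fixed 80-byte field, so only the first 79 characters survive. That field size is an assumption here, not something stated in the gdb output. The sketch below cuts the full command to 79 characters for comparison with the "Core was generated by" line.

```shell
# Assumption: the core file keeps at most 79 characters of the command
# line (an 80-byte field including the terminating NUL).
full_cmd='LA4Falcon -H24386 -fo /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/raw_reads.db /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las'

# The precision in %.79s keeps only the first 79 characters.
truncated=$(printf '%.79s' "$full_cmd")
printf '%s\n' "$truncated"
```

If the assumption holds, the printed string ends at ".../1st/0-rawreads", the same place where the gdb line stops.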

/newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/.raw_reads.*

What are the sizes of those files? If not too big, maybe you can get that to me, plus the .las file. If I can repro, I can fix it.

Otherwise, you need to build a debug version of LA4Falcon and get a stack-trace.

In the meantime, try:

LAshow -a /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/raw_reads.db /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las

That might give you a hint. Also try LAdump. Both should be in your PATH from pitchfork. If that fails, what is the filesize of the .las file? It might be empty/corrupt.
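For the file-size question, a small check along these lines may help. It is a sketch with a hypothetical helper name (check_nonempty), and it only detects a missing or zero-length file; it cannot tell whether a non-empty .las file is corrupt.

```shell
# Hypothetical helper: report whether a file exists and is non-empty,
# and print its size in bytes.
# Returns 0 if non-empty, 1 if empty, 2 if missing.
check_nonempty() {
    f=$1
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 2
    elif [ ! -s "$f" ]; then
        echo "empty: $f"
        return 1
    fi
    size=$(wc -c < "$f" | tr -d ' ')    # strip padding some wc versions add
    echo "ok: $f ($size bytes)"
}

# Example with the path from this thread:
# check_nonempty /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las
```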

AGI-chandler commented 7 years ago

# du -shc /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/.raw_reads.*
23G /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/.raw_reads.bps
227M    /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/.raw_reads.idx
23G total
# du -hsc /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las
3.9G    /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las
3.9G    total
#

We could compress it and put it on our FTP if you want.

When I run LAshow and LAdump, they print a large amount of data. I let each run for a little while and it started filling the output file; here is the beginning of each. Should I let them finish? How could I enable debugging for LA4Falcon?

$ LAshow -a /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/raw_reads.db /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las > LAshow.out
^C
$ head LAshow.out 

raw_reads.170: 96,777,976 records

 4,042,715          87 n   [ 18,686.. 20,679] x [    782..  2,767]  ~  32.4%   (    644 diffs,   3 trace pts)

      18676 ggcgaaccga[atgccaa-cacgaggagcc-tacttcaa-ccatatg-ggactgttgttgagac-cta-caacaa-tc-gtgc-ataa-a--accacaaa
            ::::::::::[|||||||*||*||||||||*|*||||||*||||||**|||||*||*|||*||**|*|*||||||*||*|*||*|*||*|**||*|||||
        772 gaacctgtgg[atgccaagcatgaggagccctgcttcaaaccatataaggact-tt-ttg-gaagcaaacaacaaatctgggcgagaagatgac-acaaa  25.8%

      18763 t-cagaggct-g--cacacagttcactgtctcatggtgagagtgccatagaga-t-cattcttcaaa-cagccagt-taatggaaataatgccagt-ctg

$ LAdump /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/raw_reads.db /newwing/wing2/users/jzhang/work/coarctata/1st/0-rawreads/m_00170/raw_reads.170.las > LAdump.out
^C
$ head LAdump.out 
+ P 96777976
% P 1756364
+ T -17881080212
% T 130744320
@ T 130744320
P 4042715 87 n .
P 4042715 363 n .
P 4042715 644 n .
P 4042715 936 n .
P 4042715 947 n .
$ 

Thanks

pb-cdunn commented 7 years ago

How could I enable debugging for LA4Falcon?

In the DALIGNER repository directory, run rm -f LA4Falcon; CFLAGS=-g make. I think. You should run something like readlink -f $(which LA4Falcon) to convince yourself that you are using the rebuilt version.
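The check and the follow-up stack trace could be wrapped roughly as below. This is a sketch, not a tested recipe for this installation: the helper names resolve_tool and backtrace_from_core are hypothetical, and it assumes GNU readlink (for -f) and a gdb that accepts -batch and -ex.

```shell
# Hypothetical helper: print the real file behind a command found in PATH,
# following symlinks, to confirm which build is actually being run.
resolve_tool() {
    tool_path=$(command -v "$1") || return 1
    readlink -f "$tool_path"
}

# Hypothetical helper: print a full backtrace from a core file without
# entering the interactive gdb prompt.
#   $1 = path to the (debug) binary, $2 = path to the core file
backtrace_from_core() {
    gdb -batch -ex 'bt full' "$1" "$2"
}

# Example usage, after rebuilding and reproducing the crash:
# resolve_tool LA4Falcon
# backtrace_from_core "$(resolve_tool LA4Falcon)" core.17702
```

The backtrace is only useful if the core was produced by the debug build itself, so the crash has to be reproduced with the rebuilt binary first.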