walaj / svaba

Structural variation and indel detection by local assembly
GNU General Public License v3.0
235 stars, 45 forks

svaba segmentation fault (core dumped) #135

Closed yungruifei closed 6 months ago

yungruifei commented 8 months ago

Hi, I ran svaba in germline mode with the following command: ~/tools/svaba run -t ../../../2.bam -k k.bed -a RT -L 6 -I -p 10 -G ssc11_1.fa

The log file shows: Running region 1:171,501-196,501(*) on thread 47364435871488

Then it stopped with: svaba segmentation fault (core dumped)...

I tried both the conda version (1.1) and the most recent release (1.2), so the bug does not seem to be tied to a particular version. The BAM file was generated by gtx (FPGA-based) and has a header (sample name: @RG ID:test_pig SM:test_pig); k.bed is the target region, and ssc11_1.fa is the pig reference genome (Sus scrofa, V11.1).

I do not know why it fails to run, so I would appreciate any help.

Best regards! Yung

JFanbio commented 8 months ago

same Segmentation fault (core dumped) here.

walaj commented 8 months ago

Hmm OK - can you confirm this is still an issue with the latest GitHub version? It would also be helpful to try with -p 1, see which region it fails on, and then confirm that it crashes when you run just that region (using the -k flag). That will help create a minimal working example.
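
The "see which region it fails on" step can be scripted. Below is a minimal sketch, assuming the log contains lines of the form `Running region <contig>:<start>-<end>(*) on thread <id>`, as in the excerpts quoted in this thread. The helper name `last_region` is hypothetical and not part of svaba.

```python
import re

# Matches e.g. "Running region chr1:121,765,001-121,790,001(*) on thread 140262271035136"
REGION_LINE = re.compile(r"Running region (\S+):([\d,]+)-([\d,]+)")


def last_region(log_text):
    """Return the last region svaba started on, as 'contig:start-end', or None."""
    last = None
    for line in log_text.splitlines():
        m = REGION_LINE.search(line)
        if m:
            last = m  # keep overwriting so the final match wins
    if last is None:
        return None
    contig, start, end = last.groups()
    # Drop the thousands separators that svaba prints in its log
    return f"{contig}:{start.replace(',', '')}-{end.replace(',', '')}"
```

The returned string is in samtools region syntax, so it can be used to subset the BAM. Whether `-k` accepts a coordinate range in this form is an assumption here; this thread only shows `-k` taking a BED file or a contig name.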

yungruifei commented 8 months ago

Thank you, Wala. I followed up on the error: svaba runs successfully when I run it on just that region, after extracting the region with samtools :)

JFanbio commented 8 months ago

Hi, there is no region info, as shown below:

svaba run -I -L 6 -p 1 -G  $fa -t /home/jianfan/sd1/AA_mouse/01AA/${samplename}.cs.rmdup.bam -a ${samplename}-SVABA
-----------------------------------------------------------
---  Running svaba SV and indel detection on 1 threads ----
---    (inspect *.log for real-time progress updates)   ---
-----------------------------------------------------------
[M::bwa_idx_load_from_disk] read 0 ALT contigs
--- Loaded non-read data. Starting detection pipeline
Segmentation fault (core dumped)

Best, Jian

walaj commented 7 months ago

@JFanbio - you can find the region it was last running by inspecting the end of the log file.

I updated svaba to write a much more verbose log file, which may help identify the section it crashed on. Would either of you be able to re-clone and rebuild (recursively, to get the updated SeqLib), retry, and send me the last 20 lines or so of the log file?

xaviloinaz commented 7 months ago

Hi Jeremiah,

I'm running into seemingly the same issue as JFanbio above.

xloinaz@xavi-instance-6:~/testing_out_jeremiah_updated_svaba$ ../svaba/build/svaba run -p 1 -t input_bams/TCGA-A5-A2K4-01A-11D-A849-36.WholeGenome.RP-1657.tumor.downsampled_point01pct.bam -n input_bams/TCGA-A5-A2K4-01A-11D-A849-36.WholeGenome.RP-1657.tumor.downsampled_point01pct.bam -G reference_files/GRCh38.d1.vd1.fa -a 'TCGA-A5-A2K4' -D reference_files/Homo_sapiens_assembly38.dbsnp138.indels_only.recode.vcf -k 1
-----------------------------------------------------------
---  Running svaba SV and indel detection on 1 threads ----
---    (inspect *.log for real-time progress updates)   ---
-----------------------------------------------------------
[M::bwa_idx_load_from_disk] read 0 ALT contigs
--- Loaded non-read data. Starting detection pipeline
Segmentation fault (core dumped)

(I set -k to 1 to focus on the region where it fails, which is on chromosome 1.)

If I tail the last 20 lines of the log file I get the following:

Set ref and alt tags  chr1:121,740,501-121,765,501(*)
Ran chr1:121,740,501-121,765,501 | T:     2 N:     2 C:     0 | R: 12% M:  0% T:  0% C:  0% A: 86% P:  0% | CPU:    0m30s Wall:    0m09s
Running region chr1:121,765,001-121,790,001(*) on thread 140262271035136
Creating a new BFC error corrector on region chr1:121,765,001-121,790,001(*)
Starting timer chr1:121,765,001-121,790,001(*)
Setting region -- Getting reads from BAM with prefix n001 on region chr1:121,765,001-121,790,001(*)
Concatenating reads from BAM with prefix n001 on region chr1:121,765,001-121,790,001(*)
Merging Overlapping Intervals from BAM with prefix n001 on region chr1:121,765,001-121,790,001(*)
Creating tree map from BAM with prefix n001 on region chr1:121,765,001-121,790,001(*)
Setting region -- Getting reads from BAM with prefix t000 on region chr1:121,765,001-121,790,001(*)
Concatenating reads from BAM with prefix t000 on region chr1:121,765,001-121,790,001(*)
Merging Overlapping Intervals from BAM with prefix t000 on region chr1:121,765,001-121,790,001(*)
Creating tree map from BAM with prefix t000 on region chr1:121,765,001-121,790,001(*)
Collecting cigar strings on region chr1:121,765,001-121,790,001(*)
Collecting and clearing reads chr1:121,765,001-121,790,001(*)
Running mate collection loops chr1:121,765,001-121,790,001(*)
        Total of 0 bad mate regions for this thread
Removing hardclips chr1:121,765,001-121,790,001(*)
Doing kmer correction chr1:121,765,001-121,790,001(*)
BFC Train chr1:121,765,001-121,790,001(*)

Thanks, Xavi
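
With the verbose log, the last line that names a region also names the step that was running when the process died (here `BFC Train`, which matches the `SeqLib::BFC::Train()` frame in the valgrind output further down). A minimal sketch for pulling that out of a log tail, assuming step lines end in `<contig>:<start>-<end>(*)` as in the excerpt above; the helper name `last_step` is hypothetical.

```python
import re

# A region token as printed in the verbose log, e.g. "chr1:121,765,001-121,790,001(*)"
REGION = re.compile(r"(\S+:[\d,]+-[\d,]+)\(\*\)")


def last_step(log_text):
    """Return (step, region) for the last log line that names a region, else None."""
    for line in reversed(log_text.splitlines()):
        m = REGION.search(line)
        if m:
            # Everything before the region token is the step description
            return line[:m.start()].strip(), m.group(1)
    return None
```

Lines without a region token, such as the "Total of 0 bad mate regions" line, are skipped.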

jalberge commented 7 months ago

Confirming that compiling commit 859b9a5 leads to a Segmentation fault (core dumped) error on Ubuntu 20.04 but not on macOS (clang 12). Here are the valgrind and gdb logs from Ubuntu (the build may not have been compiled in full debug mode, but they give an idea of where the bug occurs):

valgrind --leak-check=full svaba run -p 1 -k chr22 -t /mnt/nfs/workspace/chr22/t.chr22.bam -n /mnt/nfs/workspace/chr22/n.chr22.bam -G /mnt/nfs/workspace/ref/Homo_sapiens_assembly38.fasta

==15126== Thread 3:
==15126== Invalid read of size 8
==15126==    at 0x2CEFDF: get_subhash (htab.c:51)
==15126==    by 0x2CEFDF: bfc_ch_insert (htab.c:66)
==15126==    by 0x2CFEAD: bfc_kmer_insert (bfc.c:57)
==15126==    by 0x2CFEAD: worker_count (bfc.c:79)
==15126==    by 0x2C9989: ktf_worker (kthread.c:42)
==15126==    by 0x4914608: start_thread (pthread_create.c:477)
==15126==    by 0x4D9C352: clone (clone.S:95)
==15126==  Address 0x10000038588038 is not stack'd, malloc'd or (recently) free'd
==15126==
==15126==
==15126== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==15126==  General Protection Fault
==15126==    at 0x2CEFDF: get_subhash (htab.c:51)
==15126==    by 0x2CEFDF: bfc_ch_insert (htab.c:66)
==15126==    by 0x2CFEAD: bfc_kmer_insert (bfc.c:57)
==15126==    by 0x2CFEAD: worker_count (bfc.c:79)
==15126==    by 0x2C9989: ktf_worker (kthread.c:42)
==15126==    by 0x4914608: start_thread (pthread_create.c:477)
==15126==    by 0x4D9C352: clone (clone.S:95)
==15126==
==15126== HEAP SUMMARY:
==15126==     in use at exit: 5,700,886,690 bytes in 1,233,003 blocks
==15126==   total heap usage: 16,526,896 allocs, 15,293,893 frees, 8,692,135,753 bytes allocated
==15126==
==15126== Thread 1:
==15126== 288 bytes in 1 blocks are possibly lost in loss record 202 of 375
==15126==    at 0x4844277: calloc (vg_replace_malloc.c:1675)
==15126==    by 0x40149DA: allocate_dtv (dl-tls.c:286)
==15126==    by 0x40149DA: _dl_allocate_tls (dl-tls.c:532)
==15126==    by 0x4915322: allocate_stack (allocatestack.c:622)
==15126==    by 0x4915322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660)
==15126==    by 0x136BF4: sendThreads(SeqLib::GenomicRegionCollection<SeqLib::GenomicRegion>&) (in /opt/svaba/build/svaba)
==15126==    by 0x139927: runsvaba(int, char**) (in /opt/svaba/build/svaba)
==15126==    by 0x12F4BC: main (in /opt/svaba/build/svaba)
==15126==
==15126== 288 bytes in 1 blocks are possibly lost in loss record 203 of 375
==15126==    at 0x4844277: calloc (vg_replace_malloc.c:1675)
==15126==    by 0x40149DA: allocate_dtv (dl-tls.c:286)
==15126==    by 0x40149DA: _dl_allocate_tls (dl-tls.c:532)
==15126==    by 0x4915322: allocate_stack (allocatestack.c:622)
==15126==    by 0x4915322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660)
==15126==    by 0x2C9CA7: kt_for (kthread.c:59)
==15126==    by 0x2D009B: fml_count (bfc.c:95)
==15126==    by 0x268C1F: SeqLib::BFC::learn_correct() (in /opt/svaba/build/svaba)
==15126==    by 0x268C7C: SeqLib::BFC::Train() (in /opt/svaba/build/svaba)
==15126==    by 0x142580: runWorkItem(SeqLib::GenomicRegion const&, svabaThreadUnit&, unsigned long) (in /opt/svaba/build/svaba)
==15126==    by 0x163F09: svabaThread::runThread(void*) (in /opt/svaba/build/svaba)
==15126==    by 0x4914608: start_thread (pthread_create.c:477)
==15126==    by 0x4D9C352: clone (clone.S:95)
==15126==
==15126== LEAK SUMMARY:
==15126==    definitely lost: 0 bytes in 0 blocks
==15126==    indirectly lost: 0 bytes in 0 blocks
==15126==      possibly lost: 576 bytes in 2 blocks
==15126==    still reachable: 5,700,886,114 bytes in 1,233,001 blocks
==15126==         suppressed: 0 bytes in 0 blocks
==15126== Reachable blocks (those to which a pointer was found) are not shown.
==15126== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==15126==
==15126== Use --track-origins=yes to see where uninitialised values come from
==15126== For lists of detected and suppressed errors, rerun with: -s
==15126== ERROR SUMMARY: 175892 errors from 21 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

And the gdb backtrace:

gdb --args svaba run -p 1 -k chr22 -t /mnt/nfs/workspace/chr22/t.chr22.bam -n /mnt/nfs/workspace/chr22/n.chr22.bam -G /mnt/nfs/workspace/ref/Homo_sapiens_assembly38.fasta
(gdb) run
...
(gdb) bt
#0  0x000055555571afdf in get_subhash (key=<synthetic pointer>, x=0x7ffea6948e00, ch=<optimized out>) at htab.c:51
#1  bfc_ch_insert (ch=<optimized out>, x=x@entry=0x7ffea6948e80, is_high=is_high@entry=1, forced=forced@entry=0) at htab.c:66
#2  0x000055555571beae in bfc_kmer_insert (x=<synthetic pointer>, tid=0, is_high=1, cs=0x7ffea7949f80) at bfc.c:57
#3  worker_count (_data=<optimized out>, k=<optimized out>, tid=0) at bfc.c:79
#4  0x000055555571598a in ktf_worker (data=0x7ffea7949ee0) at kthread.c:42
#5  0x00007ffff7ef2609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007ffff7ac9353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

If you have a debug-mode build that I could try, I'm happy to do so. Thanks! JB
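
A note on reading the valgrind output: it reports a "General Protection Fault" at address 0x10000038588038. On x86-64 with 48-bit virtual addressing (an assumption about this machine), that address is non-canonical, which usually suggests a corrupted or truncated pointer rather than a NULL dereference. A small sketch of the check follows; the function name is hypothetical.

```python
def is_canonical_x86_64(addr, va_bits=48):
    """True if bits 63..(va_bits-1) of addr are all zeros or all ones.

    Assumes a 64-bit address space with va_bits implemented bits
    (48 on most x86-64 systems without 5-level paging).
    """
    top = addr >> (va_bits - 1)                 # the bits that must all agree
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits when va_bits == 48
    return top == 0 or top == all_ones


# Address from the valgrind "Invalid read" report
faulting = 0x10000038588038
# Pointer argument x from frame #0 of the gdb backtrace, for comparison
stack_ptr = 0x7FFEA6948E00
```

Here `is_canonical_x86_64(faulting)` is false because bit 52 is set while the bits around it are clear, whereas the pointer from the gdb backtrace passes the check.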

tmrnov commented 6 months ago

Hi, is there any follow-up on this error? I have the same problem.

walaj commented 6 months ago

Yes, I've had a number of reports. I'm working with someone here to reproduce it and will update as soon as possible. I think it has to do with a change to SeqLib that allowed compilation on Mac. If you build/link with an older SeqLib, you may be able to get around it. I will update with a better answer when able.

xaviloinaz commented 6 months ago

^ Yes, that worked — thank you, Jeremiah! The issue seemed to be with fermi-lite, a dependency of SeqLib. So if you build with the 3d1eba7 commit of SeqLib and the 5bc90f8 commit of fermi-lite, things should run fine.
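
For anyone wanting to reproduce this workaround, here is a minimal sketch that pins the two commits named above. It assumes SeqLib is checked out at `SeqLib/` inside the svaba clone and fermi-lite at `SeqLib/fermi-lite/`; those paths and the helper names are assumptions, so adjust them to your checkout. It defaults to a dry run that only prints the git commands.

```python
import subprocess
from pathlib import Path

SEQLIB_COMMIT = "3d1eba7"      # SeqLib commit reported to work in this thread
FERMI_LITE_COMMIT = "5bc90f8"  # fermi-lite commit reported to work in this thread


def pin_commands(svaba_dir, seqlib_subdir="SeqLib", fermi_subdir="fermi-lite"):
    """Build the git commands that check out the older commits (paths are assumed)."""
    seqlib = Path(svaba_dir) / seqlib_subdir
    fermi = seqlib / fermi_subdir
    return [
        ["git", "-C", str(seqlib), "checkout", SEQLIB_COMMIT],
        # fermi-lite is pinned second so the SeqLib checkout cannot override it
        ["git", "-C", str(fermi), "checkout", FERMI_LITE_COMMIT],
    ]


def pin(svaba_dir, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    cmds = pin_commands(svaba_dir)
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return cmds
```

After pinning, svaba still has to be rebuilt from a clean build directory; the build steps themselves are not shown in this thread and are left out here.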

tmrnov commented 6 months ago

Thank you @xaviloinaz!! It also works for me now using the older commits of SeqLib and fermi-lite.

walaj commented 6 months ago

Oof -- I made a silly mistake. I found the issue in fermi-lite and fixed it. Please see the updated svaba and the updated submodules to resolve this. Will close.