jsh58 / Genrich

Detecting sites of genomic enrichment
MIT License

Segmentation fault (core dumped) #7

Closed ShuangHe33 closed 5 years ago

ShuangHe33 commented 5 years ago

Hi, thanks for the wonderful software.

Command: Genrich -t test.rdrm.bam -o test/test.narrowpeak -j
Error: Segmentation fault (core dumped)

[debug01: /lustre1/cxu_pkuhpc/hs/test/Genrich/] $ gdb Genrich
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-83.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /lustre1/cxu_pkuhpc/program/Genrich-master/Genrich...done.
(gdb) run
Starting program: /lustre1/cxu_pkuhpc/program/Genrich-master/Genrich
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Error! Need input/output files
Usage: ./Genrich -t -o [optional arguments]
Required arguments:
  -t Input SAM/BAM file(s) for experimental sample(s)
  -o Output peak file (in ENCODE narrowPeak format)
Optional I/O arguments:
  -c Input SAM/BAM file(s) for control sample(s)
  -f Output bedgraph-ish file for p/q values
  -k Output bedgraph-ish file for pileups and p-values
  -b Output BED file for reads/fragments/intervals
  -R Output file for PCR duplicates (only with -r)
Filtering options:
  -r Remove PCR duplicates
  -e Comma-separated list of chromosomes to exclude
  -E Input BED file(s) of genomic regions to exclude
  -m Minimum MAPQ to keep an alignment (def. 0)
  -s Keep sec alns with AS >= bestAS - (def. 0)
  -y Keep unpaired alignments (def. false)
  -w Keep unpaired alns, lengths changed to
  -x Keep unpaired alns, lengths changed to paired avg
Options for ATAC-seq:
  -j Use ATAC-seq mode (def. false)
  -d Expand cut sites to bp (def. 100)
Options for peak-calling:
  -q Maximum q-value (FDR-adjusted p-value; def. 0.05)
  -p Maximum p-value (overrides -q if set)
  -a Minimum AUC for a peak (def. 20.0)
  -l Minimum length of a peak (def. 0)
  -g Maximum distance between signif. sites (def. 100)
Other options:
  -X Skip peak-calling
  -P Call peaks directly from a log file (-f)
  -z Option to gzip-compress output(s)
  -v Option to print status updates/counts to stderr

Program exited with code 01.
Missing separate debuginfos, use: debuginfo-install zlib-1.2.3-29.el6.x86_64

My data were aligned with bowtie2 (paired-end). Do you have any idea what is causing the error? Thank you in advance.

ShuangHe33 commented 5 years ago

$ samtools view -h test.sort.bam | head -1000 > test.sam
$ samtools sort -n test.sam -o test.sort.sam -O sam
$ Genrich -t test.sort.sam -o test.peak -j -y -f test.log
$ head test.peak
chr10 3099933 3100169 peak_0 1000 . 5322.708984 33.154343 27.771778 59
chr10 3100381 3100585 peak_1 1000 . 5028.774902 33.154343 27.771778 61
chr10 3100703 3100803 peak_2 1000 . 2303.317383 29.395199 24.334204 50
chr10 3100925 3101101 peak_3 1000 . 4641.216309 37.145863 31.252241 88

When I use the small SAM file, it works well. It seems that my raw data is too large, and something like a stack overflow may be happening.

jsh58 commented 5 years ago

It is hard to say what the cause of the segfault is without access to your dataset. It is true that a stack overflow will appear as a segfault, so if that is the cause, you may want to increase the stack size. Also, with gdb, after run you need to include the command-line arguments for Genrich (-t test.rdrm.bam -o test/test.narrowpeak -j).
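For readers wondering how to raise the stack size: it is a per-process limit inherited from the shell, so no change to the Genrich source is needed. A minimal sketch, assuming a bash-like shell on Linux and that the hard limit permits the increase (on a cluster, the job scheduler may impose its own cap):

```shell
# Print the current soft stack limit (in KiB, or "unlimited").
echo "soft stack limit before: $(ulimit -S -s)"

# Raise the soft limit to the hard limit for this shell and its children.
# No root access is needed as long as the hard limit is not exceeded.
ulimit -S -s "$(ulimit -H -s)"

echo "soft stack limit after:  $(ulimit -S -s)"
```

Rerun the failing Genrich command from the same shell, or put the `ulimit` line at the top of the job script. If the segfault persists at the raised limit, a stack overflow is probably not the cause.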

ShuangHe33 commented 5 years ago

Thank you for your reply. I used gdb according to your suggestion.

(gdb) run -t test.sort.bam -o test.peak -j
Starting program: /lustre1/cxu_pkuhpc/program/Genrich-master/Genrich -t test.sort.bam -o test.peak -j
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?

Program received signal SIGSEGV, Segmentation fault.
0x000000000040efad in checkBAM () at Genrich.c:5049
5049 char m = gzgetc(in.gzf);

I also submitted the script to the fat node with 2 CPUs (1T, 24 cores), but it still returned the same error. I don't know how to increase the stack size. Do I need to change the program code? Thank you in advance.

jsh58 commented 5 years ago

Unless there's something wrong with in.gzf (which you should check with gdb), it seems your computer does not understand gzgetc, which is a library function in zlib. This suggests an error in compilation with zlib. Did you compile Genrich with zlib 1.2.8?

This also would explain why the program failed on your large BAM file, but succeeded on your small SAM file, which is not gzip-compressed. Does the program fail on a small BAM file?
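One quick way to tell whether an input file is gzip-compressed is to look at its first two bytes: gzip streams, including the BGZF blocks that make up a BAM file, start with 0x1f 0x8b, while a plain-text SAM file does not. A small sketch; the helper name `is_gzip` is made up for illustration and is not part of Genrich or samtools:

```shell
# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 1f 8b.
# (Hypothetical helper for illustration.)
is_gzip() {
  magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "1f8b" ]
}

# Demonstration on two throwaway files.
tmpdir=$(mktemp -d)
printf '@HD\tVN:1.0\n' > "$tmpdir/plain.sam"
printf '@HD\tVN:1.0\n' | gzip -c > "$tmpdir/compressed.gz"

for f in "$tmpdir/plain.sam" "$tmpdir/compressed.gz"; do
  if is_gzip "$f"; then
    echo "$(basename "$f"): gzip-compressed"
  else
    echo "$(basename "$f"): not gzip-compressed"
  fi
done

rm -rf "$tmpdir"
```

Running `is_gzip` on a small BAM and on the SAM produced from it shows which of the two goes through the compressed-input path that is under suspicion here.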

ShuangHe33 commented 5 years ago

Thank you for your kind reply. Yes, you're totally right, the program failed on a small BAM file. Could you please tell me how to check whether I compiled Genrich with zlib or not? By the way, my computer's zlib version is older than 1.2.8. Thank you so much. The following are the dynamic dependencies of Genrich.

$ ldd ./Genrich
linux-vdso.so.1 (0x00007fff6e7a0000)
libz.so.1 => /lib64/libz.so.1 (0x0000003713200000)
libm.so.6 => /lib64/libm.so.6 (0x0000003712e00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003712200000)
/lib64/ld-linux-x86-64.so.2 (0x0000003711e00000)
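The `ldd` listing answers part of this question: the `libz.so.1` line shows that the binary is dynamically linked against zlib, and which copy the loader resolves at run time. A short sketch that pulls out just that path; `zlib_path` is a hypothetical helper name, and it assumes the usual `name => path (address)` layout of `ldd` output on Linux:

```shell
# zlib_path: read `ldd` output on stdin and print the resolved libz path.
# (Hypothetical helper for illustration.)
zlib_path() {
  awk '$1 ~ /^libz\.so/ && $2 == "=>" { print $3 }'
}

# Example using the libz line from the listing above.
printf '\tlibz.so.1 => /lib64/libz.so.1 (0x0000003713200000)\n' | zlib_path
```

On a real binary this would be `ldd ./Genrich | zlib_path`. Note that this reports the library found at run time; the headers used at compile time may belong to a different zlib version.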

ShuangHe33 commented 5 years ago

When I used the newer zlib 1.2.8, it still failed and returned the same error on the small BAM file.

$ ldd ./Genrich
linux-vdso.so.1 (0x00007fffaf7ff000)
libz.so.1 => /apps/usr/zlib-1.2.8/lib/libz.so.1 (0x00002b308bafb000)
libm.so.6 => /lib64/libm.so.6 (0x0000003712e00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003712200000)
/lib64/ld-linux-x86-64.so.2 (0x0000003711e00000)

jsh58 commented 5 years ago

I suggest you convert the BAM to SAM, or, to save disk space, pipe the SAM to Genrich directly:

samtools view -h test.rdrm.bam | ./Genrich -t- -o test/test.narrowpeak -j

ShuangHe33 commented 5 years ago

Thank you for your suggestion. I have already tried this and it worked well.

JPascualAnaya commented 5 years ago

Hi there, I am having a segfault problem, but without "(core dumped)" in the message. I'm running Genrich on two BAM files (replicates) from an ATAC-seq experiment. Each file is around 7G.

The exact command and stderr:

$ Genrich -t EbNo3-A1_trimgalore_A1_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.nsorted.bam,EbNo3-A2_trimgalore_A2_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.nsorted.bam -j -o EbNo3-A_trimgalore_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.genrich.narrowPeak -f EbNo3-A_trimgalore_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.genrich.log -E $BOWTIE2_INDEXES/effective_size.Genrich.bed
Segmentation fault

Like @ShuangHe33 I've tried to debug it:

$ ~/Software/gdb-8.2.1/bin/gdb Genrich
GNU gdb (GDB) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from Genrich...done.
(gdb) run -t EbNo3-A1_trimgalore_A1_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.nsorted.bam,EbNo3-A2_trimgalore_A2_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.nsorted.bam -j -o EbNo3-A_trimgalore_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.genrich.narrowPeak -f EbNo3-A_trimgalore_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.genrich.log -E $BOWTIE2_INDEXES/effective_size.Genrich.bed
Starting program: /home/champi/Software/bin/Genrich -t EbNo3-A1_trimgalore_A1_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.nsorted.bam,EbNo3-A2_trimgalore_A2_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.nsorted.bam -j -o EbNo3-A_trimgalore_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.genrich.narrowPeak -f EbNo3-A_trimgalore_vs_eb3.2_bowtie2.paired.mapped.q30.rmdup.genrich.log -E $BOWTIE2_INDEXES/effective_size.Genrich.bed

Program received signal SIGSEGV, Segmentation fault.
0x0000000000407b25 in saveInterval (c=0x2aaaab2b1650, start=<optimized out>, end=251, qname=0x629420 "K00114:809:HTL3FBBXX:3:1101:13646:22186", count=<optimized out>, 
    bed=bed@entry=..., bedOpt=false, gzOut=false, ctrl=false, sample=0, errCount=0x7fffffffc8dc, verbose=false) at Genrich.c:2537
2537      if (c->diff->cov[start] == INT16_MAX) {
(gdb)

But I have no idea what it means...

I'm going to try feeding it SAM files, but since I'm running Genrich on multiple files at the same time, I can't directly use the stdout from samtools and need to convert BAM --> SAM, which is not ideal due to space limitations.

Any idea of the reason for the Segmentation fault?

Cheers, Juan

jsh58 commented 5 years ago

Juan,

Can you please open a new issue? This is unrelated to a failure in the zlib library.

Thanks,

John Gaspar