COMBINE-lab / salmon

šŸŸ šŸ£ šŸ± Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

salmon v1.4.0 executable compiled using release mode(-DCMAKE_BUILD_TYPE=Release) produce segmentation fault #609

Open kai2june opened 3 years ago

kai2june commented 3 years ago

Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)? salmon (bulk mode)

Describe the bug I compiled salmon v1.4.0 from source and produced the executable. When I then run "salmon index -t gentrome.fa.gz -d decoys.txt -p 12 -i salmon_index --gencode" (transcriptome and genome from your tutorial: https://combine-lab.github.io/alevin-tutorial/2019/selective-alignment/), a segmentation fault occurs. (A gdb backtrace is provided below.)

To Reproduce Steps and data to reproduce the behavior:

  1. run a docker container using ubuntu:18.04 as image

  2. Packages I installed: apt-get install -y gcc g++ make wget git curl libtbb2-dbg libtbb-dev unzip zlib1g-dev libcurl4-openssl-dev liblzma-dev libbz2-dev libcereal-dev libgff-dev libpkgconfig-perl libjemalloc-dev

     Tool versions: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0; g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0; GNU Make 4.1; cmake version 3.13.4 (from wget https://github.com/Kitware/CMake/releases/download/v3.13.4/cmake-3.13.4-Linux-x86_64.sh)

  3. git clone https://github.com/COMBINE-lab/salmon.git I'm at the top commit: commit 0813a0a (HEAD -> master, tag: v1.4.0, origin/master, origin/HEAD)

  4. In directory salmon/build, I type

    cmake -DFETCH_BOOST=TRUE -DTBB_INSTALL_DIR=/usr/include -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../stage ..
    make
    make install

  5. following your tutorial https://combine-lab.github.io/alevin-tutorial/2019/selective-alignment/

    wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M23/gencode.vM23.transcripts.fa.gz
    wget ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M23/GRCm38.primary_assembly.genome.fa.gz
    grep "^>" <(gunzip -c GRCm38.primary_assembly.genome.fa.gz) | cut -d " " -f 1 > decoys.txt
    sed -i.bak -e 's/>//g' decoys.txt
    cat gencode.vM23.transcripts.fa.gz GRCm38.primary_assembly.genome.fa.gz > gentrome.fa.gz
    salmon index -t gentrome.fa.gz -d decoys.txt -p 12 -i salmon_index --gencode
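The decoy-preparation commands in step 5 can be checked on a toy input before running them on the full mouse genome. The sketch below is only an illustration: the sequence names (`chr1`, `chr2`, `tx1`) and file names are hypothetical stand-ins, and it uses a plain pipe instead of bash process substitution so it also runs under `sh`. It shows that the pipeline yields bare sequence names and that concatenating two gzip files produces a valid gzip stream.

```shell
# Toy stand-ins for the real downloads (hypothetical names and sequences).
workdir=$(mktemp -d)
printf '>chr1 1\nACGT\n>chr2 2\nGGCC\n' | gzip > "$workdir/genome.fa.gz"
printf '>tx1\nAAAA\n' | gzip > "$workdir/transcripts.fa.gz"

# Same logic as step 5: keep header lines, take the first space-delimited
# field, then strip the leading ">".
gunzip -c "$workdir/genome.fa.gz" | grep "^>" | cut -d " " -f 1 \
  | sed -e 's/>//g' > "$workdir/decoys.txt"

# Concatenated gzip members decompress as one continuous stream.
cat "$workdir/transcripts.fa.gz" "$workdir/genome.fa.gz" > "$workdir/gentrome.fa.gz"
n_records=$(gunzip -c "$workdir/gentrome.fa.gz" | grep -c "^>")

cat "$workdir/decoys.txt"
echo "records in gentrome: $n_records"
```

If `decoys.txt` lists one name per genome sequence and the record count equals transcripts plus genome sequences, the inputs to `salmon index` are at least well-formed, which helps separate input problems from a crash in the binary itself.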

Then I get a segmentation fault. (image)

  6. gdb salmon: it seems to crash in these functions: fixFasta(), fixFastaMain(). (image)

Specifically, please provide at least the following information:

Expected behavior "salmon index" should produce a directory containing the index files, ready to be used with the "salmon quant" command.

Screenshots Two pictures in the "To Reproduce" section: a screenshot of the segmentation fault and a screenshot of the gdb backtrace.

Desktop (please complete the following information):

Additional context Thanks for your help in advance. Best, kai2june

rob-p commented 3 years ago

Hi @kai2june,

Thank you for the detailed report! It's interesting because (a) those functions aren't doing anything too exotic and (b) CentOS is the OS we use on our continuous integration. We'll try to get a better handle on what is going on here. In the meantime, could you tell us if you see the same behavior with the pre-compiled binary available from the downloads page?

P.S. One other thing worth trying. We've noticed that compiler support for interprocedural optimization isn't terrific. You can try building salmon without this option by passing -DNO_IPO=TRUE as an additional cmake flag.

kai2june commented 3 years ago

Hello @rob-p ,

  1. With the pre-compiled binary, "salmon index" completes successfully, but "salmon quant" fails to find the read file. (The "ls" command confirms that the read file SRR6269049_2.fastq exists.) (image)

  2. I added the additional "-DNO_IPO=TRUE" flag (cmake -DFETCH_BOOST=TRUE -DTBB_INSTALL_DIR=/usr/include -DCMAKE_BUILD_TYPE=Release -DNO_IPO=TRUE -DCMAKE_INSTALL_PREFIX=../stage ..), but "salmon index -t gentrome.fa.gz -d decoys.txt -p 12 -i salmon_index --gencode" (same as point 5 of To Reproduce) still crashes in fixFasta(), fixFastaMain() with a segmentation fault.

  3. I posted another issue yesterday reporting that Debug mode fails to compile, and I'm wondering if there's a resolution for that problem. Thank you :)) Title: [salmon v1.4.0 -DCMAKE_BUILD_TYPE=Debug produce compile error: -pg and -fomit-frame-pointer are incompatible #608]

rob-p commented 3 years ago

Hi @kai2june,

For the precompiled binary, it looks like it's parsing an extra space in the second fastq file name. Can you remove that extra space?
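This kind of failure can be reproduced without salmon. The sketch below is an illustration only: it assumes the stray space ended up at the end of the file name (the thread does not say where it was), and it uses an empty placeholder file in a temporary directory. A quoted path with an extra space names a different file, so an existence check fails even though "ls" shows the real file.

```shell
workdir=$(mktemp -d)
touch "$workdir/SRR6269049_2.fastq"      # empty placeholder for the read file

good="$workdir/SRR6269049_2.fastq"
bad="$workdir/SRR6269049_2.fastq "       # assumption: same name plus a trailing space

# Brackets in the output make any stray whitespace visible.
for f in "$good" "$bad"; do
  if [ -e "$f" ]; then
    echo "found:   [$f]"
  else
    echo "missing: [$f]"
  fi
done
```

Printing arguments inside brackets like this is a quick way to spot invisible whitespace in a command line that was pasted or assembled by a script.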

For the debug mode, I'll type out my thought / suggestion when at my computer :).

Best, Rob

kai2june commented 3 years ago

Hi @rob-p , That's my mistake. "salmon quant" completes successfully after removing that extra space now.

Thank you in advance for the help with debug mode. :))

rob-p commented 3 years ago

Hi @kai2june,

I'm glad to hear the pre-compiled one works. To try and compile in debug mode, I suggest the following. Let salmon be the top level directory where you checked out the repository. And assume you're not in a fresh checkout (i.e. you already tried to build and got this error). Look at the file:

salmon/external/pufferfish/CMakeLists.txt

on line 131 you should see the following:

set(DEBUG_FLAGS "-D__STDC_FORMAT_MACROS;-DSTX_NO_STD_STRING_VIEW;-pg;-g;-gstabs")

try removing the -pg from this so it reads

set(DEBUG_FLAGS "-D__STDC_FORMAT_MACROS;-DSTX_NO_STD_STRING_VIEW;-g;-gstabs")

then try to compile again and see if that is able to complete successfully.
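The same edit can be made non-interactively. The following is a sketch rather than an official procedure: it works on a temporary one-line copy of the `DEBUG_FLAGS` setting quoted above (the temporary file and the `sed` expression are assumptions for illustration), removes only the `-pg;` entry, and keeps a `.bak` backup.

```shell
# Work on a throwaway file containing the line quoted above, not the real checkout.
workdir=$(mktemp -d)
cmakelists="$workdir/CMakeLists.txt"
printf '%s\n' 'set(DEBUG_FLAGS "-D__STDC_FORMAT_MACROS;-DSTX_NO_STD_STRING_VIEW;-pg;-g;-gstabs")' > "$cmakelists"

# On lines mentioning DEBUG_FLAGS, delete the "-pg;" entry; the original is saved as CMakeLists.txt.bak.
sed -i.bak -e '/DEBUG_FLAGS/ s/-pg;//' "$cmakelists"

cat "$cmakelists"
```

To apply it to the real checkout, point `cmakelists` at salmon/external/pufferfish/CMakeLists.txt after the first build attempt has fetched that file, and compare against the `.bak` copy before rebuilding.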

kai2june commented 3 years ago

Hi @rob-p , After removing the "-pg" flag in "salmon/external/pufferfish/CMakeLists.txt", it now compiles successfully in Debug mode.

To reproduce (in the salmon/build directory): ISSUE 1: The second test fails, and I'm wondering whether this is expected.

    /root/cmake-3.13.4-Linux-x86_64/bin/cmake -DFETCH_BOOST=TRUE -DTBB_INSTALL_DIR=/usr/include -DCMAKE_BUILD_TYPE=Debug -DNO_IPO=TRUE -DCMAKE_INSTALL_PREFIX=../stage ..
    make          # to get the /root/salmon/external/pufferfish/CMakeLists.txt file
    vim /root/salmon/external/pufferfish/CMakeLists.txt   # remove the "-pg" flag on line 131
    make          # compiles successfully after removing the "-pg" flag
    make install
    make test

(image: second_test_failed)

(in the /mammoth/salmon_data directory): ISSUE 2: A segmentation fault occurs after "wrote [count] cleaned references" (the same place as in Release mode).

    /root/salmon/stage/bin/salmon index -t gentrome.fa.gz -d decoys.txt -p 12 -i salmon_index --gencode

(Data from your tutorial: https://combine-lab.github.io/alevin-tutorial/2019/selective-alignment/) (image)

    gdb /root/salmon/stage/bin/salmon core.23591

(It seems to crash in cereal::OutputArchive, fixFasta, fixFastaMain, etc.) (image)