pachterlab / kallisto

Near-optimal RNA-Seq quantification
https://pachterlab.github.io/kallisto
BSD 2-Clause "Simplified" License

Quant issue: multithreading errors #141

Open JakeGoodall opened 7 years ago

JakeGoodall commented 7 years ago

Hi there,

I'm having a small issue that I can't seem to find an answer for. I've successfully built the index and I'm now trying to run kallisto quant with the following batch script:


#!/bin/bash
#SBATCH -N 2
#SBATCH --ntasks-per-node 4
#SBATCH -J Kallisto_all_morph
#SBATCH -e /users/work/jake/Kallisto_Quantification/all_morph_assembly/%j.err
#SBATCH -o /users/work/jake/Kallisto_Quantification/all_morph_assembly/%j.out

kallisto quant -i all_morph_kallisto.idx --threads=24 -o /users/work/jake/Kallisto_Quantification/all_morph_assembly/all_morph_kallisto_orig /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Bja6_striped_rFc3_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Bja6_striped_rFc3_clean.R2.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H11_brown_rBc4_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H11_brown_rBc4_clean.R2.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H16_yellow_rDc10_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H16_yellow_rDc10_clean.R2.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H30_oy_rAc12_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H30_oy_rAc12_clean.R2.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv10_green_rCc6_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv10_green_rCc6_clean.R2.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv7_pink_rEc12_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv7_pink_rEc12_clean.R2.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk10_skin_rEc2_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk10_skin_rEc2_clean.R2.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk2_white_rEc1_clean.R1.fq.gz.PwU.qtrim.fq /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk2_white_rEc1_clean.R2.fq.gz.PwU.qtrim.fq

When I run this I get the following error message:

[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 736,493
[index] number of k-mers: 247,079,167
[index] number of equivalence classes: 1,694,433
[quant] running in paired-end mode
[quant] will process pair 1: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Bja6_striped_rFc3_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Bja6_striped_rFc3_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 2: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H11_brown_rBc4_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H11_brown_rBc4_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 3: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H16_yellow_rDc10_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H16_yellow_rDc10_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 4: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H30_oy_rAc12_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H30_oy_rAc12_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 5: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv10_green_rCc6_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv10_green_rCc6_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 6: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv7_pink_rEc12_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv7_pink_rEc12_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 7: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk10_skin_rEc2_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk10_skin_rEc2_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 8: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk2_white_rEc1_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk2_white_rEc1_clean.R2.fq.gz.PwU.qtrim.fq
[quant] finding pseudoalignments for the reads ...terminate called after throwing an instance of 'std::system_error'
  what():  Enable multithreading to use std::thread: Operation not permitted
/tmp/slurmd/job14991/slurm_script: line 10:  8848 Aborted   

I noticed that in this thread (https://groups.google.com/forum/#!msg/kallisto-sleuth-users/64o6YJ27CRw/DmI1zl59CgAJ) the user was having a similar issue. I've tried omitting the -t argument, and I've checked that all of my C++, HDF5 and kallisto versions are the newest available through linuxbrew. I'm running this on a cluster, so computing power shouldn't be an issue.

It's probably something simple I'm not seeing, any ideas?
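A side note on the batch script above: it requests 2 nodes with 4 tasks each, while kallisto is asked for 24 threads. kallisto runs as a single multithreaded process, so its threads can only use cores on one node. The sketch below is one way to keep the thread count within the allocation. It assumes the job requests CPUs with --cpus-per-task, which is when Slurm sets SLURM_CPUS_PER_TASK; pick_threads is a hypothetical helper name, and this is not a fix for the std::system_error itself.

```shell
# Hypothetical helper, not part of kallisto or Slurm.
# Prints the smaller of the requested thread count and the number of CPUs
# Slurm allocated to this task. SLURM_CPUS_PER_TASK is only set when the job
# requests --cpus-per-task (or -c), so fall back to 1 when it is missing.
pick_threads() {
    requested="$1"
    allocated="${SLURM_CPUS_PER_TASK:-1}"
    if [ "$requested" -gt "$allocated" ]; then
        echo "$allocated"
    else
        echo "$requested"
    fi
}

# Intended use inside the batch script (assumption: the job header asks for
# one node, one task and several CPUs for that task):
#   kallisto quant -i all_morph_kallisto.idx --threads="$(pick_threads 24)" ...
```

Capping the value rather than hard-coding it means the same script still runs if the allocation is later changed.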

pmelsted commented 7 years ago

Did you build kallisto yourself and in that case can you show the output of cmake? Do you see this with the binary version for linux available under releases?

It looks like the linking against pthread didn't work, as in https://stackoverflow.com/questions/17274032/c-threads-stdsystem-error-operation-not-permitted
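A quick way to check the pthread hypothesis on a dynamically linked binary is to look for libpthread in its ldd output. This is only a rough sketch under two assumptions: the binary is not statically linked (ldd reports nothing useful for static executables), and the system uses glibc older than 2.34, where libpthread is still a separate library. links_pthread is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ldd` output on stdin and succeeds (exit 0)
# if any line mentions libpthread.so, fails (non-zero exit) otherwise.
links_pthread() {
    grep -q 'libpthread\.so'
}

# Intended use (assumes kallisto is on PATH and dynamically linked):
#   if ldd "$(command -v kallisto)" | links_pthread; then
#       echo "kallisto is linked against libpthread"
#   else
#       echo "no libpthread dependency found"
#   fi
```

A missing libpthread line on such a system would be consistent with the linking problem described in the Stack Overflow question, but it is not proof on its own.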

JakeGoodall commented 7 years ago

Hi Páll,

So I think the problem goes way back to the initial install process. I'm using linuxbrew to install kallisto and its various dependencies. I went back and uninstalled Kallisto and hdf5 so that I could reinstall them and look at error messages (if present). When reinstalling I noticed the following:

[jake@garpur bin]$ brew install kallisto
==> Installing kallisto from homebrew/science
==> Installing dependencies for homebrew/science/kallisto: hdf5
==> Installing homebrew/science/kallisto dependency: hdf5
==> Downloading https://www.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.1/src/hdf5-1.10.1.tar.bz2
Already downloaded: /users/home/jake/.cache/Homebrew/hdf5-1.10.1.tar.bz2
==> autoreconf -fiv
==> ./configure --prefix=/users/home/jake/.linuxbrew/Cellar/hdf5/1.10.1 --enable-build-mode=production --with-zlib=/users/home/jake/opt/zlib --with-szlib=/users/home/jake/opt/szip --enable-static=yes --enable-shared=yes --enable-c
==> make
==> make install
🍺  /users/home/jake/.linuxbrew/Cellar/hdf5/1.10.1: 207 files, 17.8MB, built in 4 minutes 11 seconds
==> Installing homebrew/science/kallisto
==> Downloading https://linuxbrew.bintray.com/bottles-science/kallisto-0.43.1_1.x86_64_linux.bottle.tar.gz
Already downloaded: /users/home/jake/.cache/Homebrew/kallisto-0.43.1_1.x86_64_linux.bottle.tar.gz
==> Pouring kallisto-0.43.1_1.x86_64_linux.bottle.tar.gz
**_Warning: homebrew/science/kallisto dependency hdf5 was built with a different C++ standard
library (libstdc++ from gcc-5). This may cause problems at runtime._**
🍺  /users/home/jake/.linuxbrew/Cellar/kallisto/0.43.1_1: 5 files, 399.2KB

So it appears that the issue is partially derived from linuxbrew using gcc-5 library libstdc++ to build hdf5. I went back and checked my gcc installs and realised I had 4.8.5 (which c++ would usually call on) but also an independent 5.3.0 version. So I uninstalled gcc_v5.3.0, kallisto and hdf5 once again and then reinstalled kallisto and the hdf5 dependency through linuxbrew which resolved the previous install error message.

[jake@garpur bin]$ brew install kallisto
==> Installing kallisto from homebrew/science
==> Installing dependencies for homebrew/science/kallisto: hdf5
==> Installing homebrew/science/kallisto dependency: hdf5
==> Downloading https://www.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.1/src/hdf5-1.10.1.tar.bz2
Already downloaded: /users/home/jake/.cache/Homebrew/hdf5-1.10.1.tar.bz2
==> autoreconf -fiv
==> ./configure --prefix=/users/home/jake/.linuxbrew/Cellar/hdf5/1.10.1 --enable-build-mode=production --with-zlib=/users/home/jake/opt/zlib --with-szlib=/users/home/jake/opt/szip --enable-static=yes --enable-shared=yes --enable-c
==> make
==> make install
🍺  /users/home/jake/.linuxbrew/Cellar/hdf5/1.10.1: 207 files, 17.4MB, built in 3 minutes 15 seconds
==> Installing homebrew/science/kallisto
==> Downloading https://linuxbrew.bintray.com/bottles-science/kallisto-0.43.1_1.x86_64_linux.bottle.tar.gz
Already downloaded: /users/home/jake/.cache/Homebrew/kallisto-0.43.1_1.x86_64_linux.bottle.tar.gz
==> Pouring kallisto-0.43.1_1.x86_64_linux.bottle.tar.gz
🍺  /users/home/jake/.linuxbrew/Cellar/kallisto/0.43.1_1: 5 files, 399.1KB  

Hoping it was all resolved, I re-ran the quantification step but unfortunately received the same error message as before:

[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 736,493
[index] number of k-mers: 247,079,167
[index] number of equivalence classes: 1,694,433
[quant] running in paired-end mode
[quant] will process pair 1: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Bja6_striped_rFc3_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Bja6_striped_rFc3_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 2: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H11_brown_rBc4_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H11_brown_rBc4_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 3: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H16_yellow_rDc10_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H16_yellow_rDc10_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 4: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H30_oy_rAc12_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/H30_oy_rAc12_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 5: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv10_green_rCc6_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv10_green_rCc6_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 6: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv7_pink_rEc12_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Hv7_pink_rEc12_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 7: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk10_skin_rEc2_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk10_skin_rEc2_clean.R2.fq.gz.PwU.qtrim.fq
[quant] will process pair 8: /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk2_white_rEc1_clean.R1.fq.gz.PwU.qtrim.fq
                             /users/work/jake/Trinity_Assemblies/Trimmomatic_reads_all_morph/trinity/Sk2_white_rEc1_clean.R2.fq.gz.PwU.qtrim.fq
**_[quant] finding pseudoalignments for the reads ...terminate called after throwing an instance of 'std::system_error'
  what():  Enable multithreading to use std::thread: Operation not permitted_**

So I was moving forward slightly, but not quite there yet. I thought I'd try something different, so I uninstalled kallisto through linuxbrew again, downloaded the kallisto source and followed the install guide. I had to specify the location of zlib in my cmake command, i.e. cmake -DCMAKE_INSTALL_PREFIX:PATH=$HOME -DZLIB_LIBRARY=/users/home/jake/.linuxbrew/Cellar/zlib/1.2.11/lib ..

But after all that the program installed correctly. I ran a test data set through the quantification step and it completed fine (and very quickly might I add). So it would seem that my issue has been resolved. I guess there may have been something in my build environment and how it interacts with linuxbrew, or the linuxbrew version of kallisto itself that wasn't working as intended.

matrs commented 7 years ago

Just to add that the --threads option isn't working for me either. I tried one instance of the program installed on my cluster (I didn't install it) and two that I installed myself, one from conda and the other the binary from the official page. All are the same version:

kallisto version
kallisto, version 0.43.1

./kallisto_linux-v0.43.1/kallisto quant -i Kallisto_Fomme1.idx -o Kallisto_Out -b 100 --threads=8 forward_reads.fastq reverse_reads.fastq

I'm using a small index of 324M and I'm running an interactive session on a cluster, where I don't have problems using multiple processors if I ask for them (as I usually do). I checked the %CPU usage with top and it never went above 120%. I'm aware that at the beginning, while reading the index, kallisto doesn't use multiple threads. When I test other multi-threaded programs, I do see %CPU increase as I increase the number of threads they run with.
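The %CPU reading from top can be complemented by counting how many threads the running process has actually created, which helps separate "the threads were never started" from "the threads exist but are mostly idle". A minimal sketch, assuming a Linux system with /proc mounted; thread_count is a hypothetical helper, not a kallisto feature.

```shell
# Hypothetical helper: print the number of threads the kernel reports for a
# process ID, taken from the "Threads:" line of /proc/<pid>/status.
thread_count() {
    awk '/^Threads:/ {print $2}' "/proc/$1/status"
}

# Intended use while a quant run is in progress (pgrep -o picks the oldest
# matching process; assumes pgrep is installed):
#   thread_count "$(pgrep -o kallisto)"
```

If the count stays at 1 during the pseudoalignment phase with --threads=8, the option is probably not taking effect; if it is higher while %CPU stays low, the threads are likely waiting on something else.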

I haven't tested building kallisto from the source, but it would be nice for the official compiled version to work as expected.