Oshlack / JAFFA

JAFFA is a multi-step pipeline that takes either raw RNA-Seq reads, or pre-assembled transcripts, then searches for gene fusions
https://github.com/Oshlack/JAFFA/wiki

Error in filter_transcripts #75

Closed EduardoGCCM closed 2 years ago

EduardoGCCM commented 2 years ago

Hi, First, many thanks for developing this tool and the nice tutorial in the wiki.

At the moment I am trying to run the JAFFAL pipeline on the demo (simulated) data but I am getting an error at the Stage filter_transcripts:

$ JAFFA-version-2.2/tools/bin/bpipe run JAFFA-version-2.2/JAFFAL.groovy LongReadFusionSimulation/ONT_fus_sim_80err.fastq.gz

                    Starting Pipeline at 2022-04-07 03:51                               

====================================================================================================

========================================= Stage run_check ==========================================
Running JAFFA version 2.2
Checking for required data files...
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/hg38_genCode22.fa
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/hg38_genCode22.tab
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/known_fusions.txt
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/hg38.fa
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/Masked_hg38.1.bt2
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/hg38_genCode22.1.bt2
All looking good

=============================== Stage get_fasta (ONT_fus_sim_80err) ================================
java -ea -Xmx200m -cp /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bbmap/current/ jgi.ReformatReads ignorebadquality=t in=LongReadFusionSimulation/ONT_fus_sim_80err.fastq.gz out=ONT_fus_sim_80err.fastq/ONT_fus_sim_80err.fastq.fasta threads=16
Executing jgi.ReformatReads [ignorebadquality=t, in=LongReadFusionSimulation/ONT_fus_sim_80err.fastq.gz, out=ONT_fus_sim_80err.fastq/ONT_fus_sim_80err.fastq.fasta, threads=16]
Set threads to 16
Input is being processed as unpaired
Input: 17819 reads 38238518 bases
Output: 17819 reads (100.00%) 38238518 bases (100.00%)
Time: 0.756 seconds.
Reads Processed: 17819 23.58k reads/sec
Bases Processed: 38238k 50.59m bases/sec

========================= Stage minimap2_transcriptome (ONT_fus_sim_80err) =========================
[M::mm_idx_gen::6.431*1.53] collected minimizers
[M::mm_idx_gen::7.017*2.24] sorted minimizers
[M::main::7.017*2.24] loaded/built the index for 195175 target sequence(s)
[M::mm_mapopt_update::7.390*2.17] mid_occ = 110
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 195175
[M::mm_idx_stat::7.620*2.14] distinct minimizers: 16950829 (43.60% are singletons); average occurrences: 3.243; average spacing: 5.406
[M::worker_pipeline::12.149*6.52] mapped 17819 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/minimap2 -t 16 -x map-ont -c /home/ssd/egomez/ontTAS/JAFFA-version-2.2/hg38_genCode22.fa ONT_fus_sim_80err.fastq/ONT_fus_sim_80err.fastq.fasta
[M::main] Real time: 12.281 sec; CPU: 79.350 sec; Peak RSS: 2.738 GB

=========================== Stage filter_transcripts (ONT_fus_sim_80err) ===========================
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table)
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table)
ERROR: Command failed with exit status = 1 :

/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table ONT_fus_sim_80err.fastq/ONT_fus_sim_80err.fastq.paf 1000 /home/ssd/egomez/ontTAS/JAFFA-version-2.2/hg38_genCode22.tab > ONT_fus_sim_80err.fastq/ONT_fus_sim_80err.fastq.txt

========================================= Pipeline Failed ==========================================

One or more parallel stages aborted. The following messages were reported:

Branch ONT_fus_sim_80err.fastq in stage Unknown reported message:

Command failed with exit status = 1 :

/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table ONT_fus_sim_80err.fastq/ONT_fus_sim_80err.fastq.paf 1000 /home/ssd/egomez/ontTAS/JAFFA-version-2.2/hg38_genCode22.tab > ONT_fus_sim_80err.fastq/ONT_fus_sim_80err.fastq.txt

Use 'bpipe errors' to see output from failed commands.

Do you have any idea of what might be causing this error?

ls -l outputs this in the output folder:

total 79024
-rw-rw-r--. 1 egomez egomez 41627650 Apr 7 03:51 ONT_fus_sim_80err.fastq.fasta
-rw-rw-r--. 1 egomez egomez 39286391 Apr 7 03:52 ONT_fus_sim_80err.fastq.paf
-rw-rw-r--. 1 egomez egomez 0 Apr 7 03:52 ONT_fus_sim_80err.fastq.txt

And this in the JAFFA directory

total 13269464
-rw-rw-r--. 1 egomez egomez 35147 Sep 11 2021 COPYING
-rw-rw-r--. 1 egomez egomez 5180566100 Apr 7 01:54 Gencode22.V2.tar.gz
-rwxrwxr-x. 1 egomez egomez 3034 Sep 11 2021 JAFFAL.groovy
-rwxrwxr-x. 1 egomez egomez 1149 Sep 11 2021 JAFFA_assembly.groovy
-rwxrwxr-x. 1 egomez egomez 1447 Sep 11 2021 JAFFA_direct.groovy
-rwxrwxr-x. 1 egomez egomez 1680 Sep 11 2021 JAFFA_hybrid.groovy
-rwxrwxr-x. 1 egomez egomez 20628 Sep 11 2021 JAFFA_stages.groovy
-rw-rw-r--. 1 egomez egomez 818 Sep 11 2021 LICENSE
-rwxrwxr--. 1 egomez egomez 984246548 Mar 4 2015 Masked_hg38.1.bt2
-rwxrwxr--. 1 egomez egomez 732332820 Mar 4 2015 Masked_hg38.2.bt2
-rwxrwxr--. 1 egomez egomez 2698631 Mar 4 2015 Masked_hg38.3.bt2
-rwxrwxr--. 1 egomez egomez 732332815 Mar 4 2015 Masked_hg38.4.bt2
-rwxrwxr--. 1 egomez egomez 984246548 Mar 4 2015 Masked_hg38.rev.1.bt2
-rwxrwxr--. 1 egomez egomez 732332820 Mar 4 2015 Masked_hg38.rev.2.bt2
-rwxrwxr-x. 1 egomez egomez 618 Sep 11 2021 README
-rw-rw-r--. 1 egomez egomez 713 Sep 11 2021 README.md
-rwxrwxr-x. 1 egomez egomez 2284 Sep 11 2021 assemble.sh
-rwxrwxr-x. 1 egomez egomez 2458 Sep 11 2021 compile_results.R
drwxrwxr-x. 2 egomez egomez 141 Sep 11 2021 cwl
drwxrwxr-x. 2 egomez egomez 177 Sep 11 2021 docker
-rwxrwxr-x. 1 egomez egomez 1636 Sep 11 2021 get_spanning_reads.R
-rw-rw-r--. 1 egomez egomez 3273481150 Sep 11 2020 hg38.fa
-rwxrwxr--. 1 egomez egomez 130890956 Jun 19 2015 hg38_genCode22.1.bt2
-rwxrwxr--. 1 egomez egomez 74291416 Jun 19 2015 hg38_genCode22.2.bt2
-rwxrwxr--. 1 egomez egomez 1756646 Jun 19 2015 hg38_genCode22.3.bt2
-rwxrwxr--. 1 egomez egomez 74291411 Jun 19 2015 hg38_genCode22.4.bt2
-rwxrwxr--. 1 egomez egomez 322074390 Jun 19 2015 hg38_genCode22.fa
-rwxrwxr--. 1 egomez egomez 130890956 Jun 19 2015 hg38_genCode22.rev.1.bt2
-rwxrwxr--. 1 egomez egomez 74291416 Jun 19 2015 hg38_genCode22.rev.2.bt2
-rwxrwxr--. 1 egomez egomez 42879375 Jun 19 2015 hg38_genCode22.tab
-rw-r--r--. 1 egomez egomez 36990104 Mar 5 2020 hg38_genCode22_blast.nhr
-rw-r--r--. 1 egomez egomez 2342184 Mar 5 2020 hg38_genCode22_blast.nin
-rw-r--r--. 1 egomez egomez 74413424 Mar 5 2020 hg38_genCode22_blast.nsq
-rwxrwxr-x. 1 egomez egomez 6529 Sep 11 2021 install_linux64.sh
-rw-rw-r--. 1 egomez egomez 408823 Sep 11 2021 known_fusions.txt
-rwxrwxr-x. 1 egomez egomez 17542 Sep 11 2021 make_final_table.R
drwxrwxr-x. 2 egomez egomez 86 Sep 11 2021 scripts
drwxrwxr-x. 2 egomez egomez 247 Sep 11 2021 src
drwxrwxr-x. 13 egomez egomez 4096 Apr 7 03:32 tools
-rw-rw-r--. 1 egomez egomez 1355 Apr 7 03:32 tools.groovy

Many thanks for your help!

nadiadavidson commented 2 years ago

Hi Eduardo, This error looks like process_transcriptome_align_table may have been compiled under different conditions than how it's now being run. Are you running on the same machine you installed on? If you installed on a cluster head node and are running on a different node there might be some difference in the libraries which are accessible?

Happy to try and help if you let me know more details about how JAFFAL was installed/compiled and run, and what kind of system you are working on. I've had this error come up with other software packages too.

Cheers, Nadia.
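
One way to check for the kind of library mismatch described above is to list the GLIBCXX version tags that a given libstdc++.so.6 file contains and compare them with the ones named in the error (GLIBCXX_3.4.20 and GLIBCXX_3.4.21). The following is a minimal sketch, not part of JAFFA: the helper names list_glibcxx and has_glibcxx are made up for illustration, and the library paths in the example comments are assumptions that may differ on your system.

```shell
# List the GLIBCXX version tags embedded in a libstdc++ shared object.
# $1 = path to a libstdc++.so.6 file (treated as binary-safe text by grep -a)
list_glibcxx() {
    LC_ALL=C grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | LC_ALL=C sort -u
}

# Succeed (exit status 0) if the library at $1 contains the exact tag $2.
has_glibcxx() {
    list_glibcxx "$1" | grep -qxF "$2"
}

# Example calls (paths are assumptions; adjust to your system):
# has_glibcxx /lib64/libstdc++.so.6 GLIBCXX_3.4.20 || echo "system libstdc++ lacks GLIBCXX_3.4.20"
# has_glibcxx "$CONDA_PREFIX/lib/libstdc++.so.6" GLIBCXX_3.4.20 && echo "conda copy has it"
```

If the library the loader picks up (here /lib64/libstdc++.so.6, according to the error message) lacks a tag that the binary requires, the binary was most likely built against a newer libstdc++ than the one found at run time.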

EduardoGCCM commented 2 years ago

Hi Nadia, Many thanks for your fast response.

I am running JAFFAL on a cluster. I installed R and a newer version of gcc in a conda environment (the cluster has version 4.8.5, and I was getting a different error when running JAFFAL). These are the packages installed in the conda environment:

Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
_r-mutex 1.0.1 anacondar_1 conda-forge
bioconductor-biocgenerics 0.18.0 r3.2.2_0 bioconda
bioconductor-iranges 2.4.8 0 bioconda
bioconductor-s4vectors 0.8.11 0 bioconda
cairo 1.16.0 ha61ee94_1011 conda-forge
expat 2.4.8 h27087fc_0 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.14.0 h8e229c2_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fribidi 1.0.10 h36c2ea0_0 conda-forge
gcc-5 5.4.0 2 daleydeng
gettext 0.19.8.1 h73d1719_1008 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
graphite2 1.3.13 h58526e2_1001 conda-forge
harfbuzz 4.2.0 hf9f4e7c_1 conda-forge
icu 70.1 h27087fc_0 conda-forge
isl 0.17.1 0 daleydeng
jbig 2.1 h7f98852_2003 conda-forge
jpeg 9e h7f98852_0 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libdeflate 1.10 h7f98852_0 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc 7.2.0 h69d50b8_2 conda-forge
libgcc-ng 11.2.0 h1d223b6_14 conda-forge
libglib 2.70.2 h174f98d_4 conda-forge
libgomp 11.2.0 h1d223b6_14 conda-forge
libiconv 1.16 h516909a_0 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libstdcxx-ng 11.2.0 he4da1e4_14 conda-forge
libtiff 4.3.0 h542a066_3 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp-base 1.2.2 h7f98852_1 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.9.12 h22db469_2 conda-forge
libzlib 1.2.11 h166bdaf_1014 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
mpc 1.2.1 h9f54685_0 conda-forge
mpfr 4.1.0 h9202a9a_1 conda-forge
ncurses 6.3 h9c3ff4c_0 conda-forge
pango 1.50.6 hbd2fdc8_0 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
r 3.2.2 0
r-base 3.2.2 0
r-boot 1.3_17 r3.2.2_0a
r-class 7.3_14 r3.2.2_0a
r-cluster 2.0.3 r3.2.2_0a
r-codetools 0.2_14 r3.2.2_0a
r-foreign 0.8_66 r3.2.2_0a
r-kernsmooth 2.23_15 r3.2.2_0a
r-lattice 0.20_33 r3.2.2_0a
r-mass 7.3_45 r3.2.2_0a
r-matrix 1.2_2 r3.2.2_0a
r-mgcv 1.8_9 r3.2.2_0a
r-nlme 3.1_122 r3.2.2_0a
r-nnet 7.3_11 r3.2.2_0a
r-recommended 3.2.2 r3.2.2_0
r-rpart 4.1_10 r3.2.2_0a
r-spatial 7.3_11 r3.2.2_0a
r-survival 2.38_3 r3.2.2_0a
readline 8.1 h46c0cb4_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.11 h166bdaf_1014 conda-forge
zstd 1.5.2 ha95c52a_0 conda-forge

I ran JAFFAL in the same session in which I installed JAFFA (I was not running it as a job). The JAFFA installation was done as suggested in the tutorial: I downloaded the JAFFA tar and the reference tar, decompressed them (the reference inside the JAFFA folder), and ran the install script.

Many thanks for your help! Eduardo

EduardoGCCM commented 2 years ago

Hi Nadia, As you mentioned process_transcriptome_align_table, I decided to re-install JAFFA and check the messages I get at the end. I got this:

process_transcriptome_align_table not found, fetching it
make_3_gene_fusion_table not found, fetching it
Checking that all required tools were installed:
bpipe looks like it has been installed
velveth looks like it has been installed
velvetg looks like it has been installed
oases looks like it has been installed
trimmomatic looks like it has been installed
samtools looks like it has been installed
bowtie2 looks like it has been installed
blat looks like it has been installed
dedupe looks like it has been installed
reformat looks like it has been installed
extract_seq_from_fasta looks like it has been installed
make_simple_read_table looks like it has been installed
blastn looks like it has been installed
minimap2 looks like it has been installed
process_transcriptome_align_table looks like it has been installed
make_3_gene_fusion_table looks like it has been installed


All commands installed successfully!

Could that be the origin of the problem?

Thanks! Eduardo

nadiadavidson commented 2 years ago

Thanks for this extra information. The message you get is okay; it just means that process_transcriptome_align_table has been installed. Can you do a little test for me and send the output:

Run process_transcriptome_align_table:
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table
You should get the error.

Then recompile this file:
g++ -std=c++11 -O3 -o /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table /home/ssd/egomez/ontTAS/JAFFA-version-2.2/src/process_transcriptome_align_table.c++
Does it compile okay?

Check the g++ version:
g++ --version

Try running again to see if you still get the error:
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table

I wonder if conda doesn't have the right glibc++ library installed or your LD_LIBRARY_PATH is defaulting to the system version instead of conda's?
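
The LD_LIBRARY_PATH hypothesis can be tested from the shell. The error message shows the loader resolving /lib64/libstdc++.so.6, while the conda environment listed earlier includes libstdcxx-ng 11.2.0. The sketch below is illustrative only: prepend_path is a hypothetical helper, and it assumes that conda's copy of libstdc++ lives under "$CONDA_PREFIX/lib", which is typical but should be confirmed on your system.

```shell
# Print the colon-separated search path $2 with directory $1 placed first.
# The path is printed unchanged if $1 is already its first entry.
prepend_path() {
    case "$2" in
        "$1"|"$1":*) printf '%s\n' "$2" ;;
        *)           printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# Example (commented out; paths are assumptions):
# 1. See which libstdc++ the dynamic loader resolves for the binary:
#      ldd /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table | grep 'libstdc++'
# 2. Give the conda environment's library directory priority, then repeat step 1:
#      export LD_LIBRARY_PATH="$(prepend_path "$CONDA_PREFIX/lib" "$LD_LIBRARY_PATH")"
```

If ldd then reports a libstdc++.so.6 inside the conda environment instead of /lib64, the loader is picking up the newer library. Whether that alone removes the error depends on the tags that copy provides.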

EduardoGCCM commented 2 years ago

Hi, Yes, I still get the same error. I am pasting the commands and their output below (I am not sure whether I should see any message after compiling).

(jaffa) [egomez@vrtx-1 ontTAS]$ /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table)
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table)
(jaffa) [egomez@vrtx-1 ontTAS]$ g++ -std=c++11 -O3 -o /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table /home/ssd/egomez/ontTAS/JAFFA-version-2.2/src/process_transcriptome_align_table.c++
(jaffa) [egomez@vrtx-1 ontTAS]$ g++ --version
g++ (GCC) 5.4.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(jaffa) [egomez@vrtx-1 ontTAS]$ /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table)
/home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table)

I was also thinking about issues regarding the conda vs system versions of glibc++. I will look a bit more into it.

EduardoGCCM commented 2 years ago

OK, loading a newer version of the system glibc++ (10.1.0) using module load solves the issue. Now I have managed to run JAFFAL on the demo data without problems :)

Thanks!
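
After changing the environment (for example with module load, as above), a quick pre-flight check can confirm that the loader errors are gone before re-running the whole pipeline. The sketch below is only an illustration: glibcxx_errors is a hypothetical helper, and it assumes the command it runs exits promptly when called without arguments, as process_transcriptome_align_table did in the session shown earlier.

```shell
# Run the given command and print any GLIBCXX version tags that the dynamic
# loader reports as "not found" on stderr. Prints nothing if there are none.
# The command's normal stdout is discarded; only stderr is inspected.
glibcxx_errors() {
    "$@" 2>&1 >/dev/null \
        | grep 'not found' \
        | grep -o 'GLIBCXX_[0-9][0-9.]*' \
        | LC_ALL=C sort -u || true
}

# Example (commented out; the path is the one used in this thread):
# missing=$(glibcxx_errors /home/ssd/egomez/ontTAS/JAFFA-version-2.2/tools/bin/process_transcriptome_align_table)
# if [ -n "$missing" ]; then
#     echo "libstdc++ is still too old; missing: $missing"
# fi
```

An empty result means the loader no longer complains about missing GLIBCXX versions. It does not by itself show that the rest of the pipeline will run.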