williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

Reference not finishing the build #124

Open aleighbrown opened 4 years ago

aleighbrown commented 4 years ago

Hello,

I am building the reference for IRFinder using a SGE cluster, submitting with the following params

#$ -l tmem=50G
#$ -l h_vmem=50G
#$ -l h_rt=92:00:00

#$ -j y
#$ -N irbuild

source ~/.bash_profile

cd /home/annbrown/tools/IRFinder
bin/IRFinder -m BuildRef -r REF/Human-GRCh38-release100_retest -e REF/extra-input-files/RNA.SpikeIn.ERCC.fasta.gz -b REF/extra-input-files/Human_hg38_nonPolyA_ROI.bed ftp://ftp.ensembl.org/pub/release-100/gtf/homo_sapiens/Homo_sapiens.GRCh38.100.gtf.gz

The build keeps ending at the Reference preparation step

tail irbuild.o2519138  -n30
Nov 03 11:39:16 ... finished generating suffix array
Nov 03 11:39:16 ... generating Suffix Array index
Nov 03 11:52:37 ... completed Suffix Array index
Nov 03 11:52:37 ..... processing annotations GTF
Nov 03 11:53:29 ..... inserting junctions into the genome indices
Nov 03 12:05:31 ... writing Genome to disk ...
Nov 03 12:05:44 ... writing Suffix Array to disk ...
Nov 03 12:07:43 ... writing SAindex to disk
Nov 03 12:07:59 ..... finished successfully
<Phase 2: Mapability Calculation>
Nov 03 12:08:05 ... mapping genome fragments back to genome...
Nov 03 12:49:01 ... sorting aligned genome fragments...
[bam_sort_core] merging from 48 files and 48 in-memory blocks...
Nov 03 13:02:10 ... indexing aligned genome fragments...
Nov 03 13:03:52 ... filtering aligned genome fragments by chromosome/scaffold...
Nov 03 13:11:42 ... merging filtered genome fragments...
Nov 03 13:13:07 ... calculating regions for exclusion...
Nov 03 13:20:25 ... cleaning temporary files...
<Phase 3: IRFinder Reference Preparation>
Nov 03 13:20:40 ... building Ref 1...
Nov 03 13:21:33 ... building Ref 2...
Nov 03 13:21:37 ... building Ref 3...
Nov 03 13:21:37 ... building Ref 4...
Nov 03 13:22:29 ... building Ref 5...
Nov 03 13:23:30 ... building Ref 6...
Nov 03 13:23:31 ... building Ref 7...
Nov 03 13:23:35 ... building Ref 8...
Nov 03 13:23:37 ... building Ref 9...
Nov 03 13:23:37 ... building Ref 10c...
Nov 03 13:23:37 ... building Ref 11c...

And trying to run gives the following error

/SAN/vyplab/alb_projects/tools/IRFinder/bin/IRFinder -m BAM -r /SAN/vyplab/alb_projects/tools/IRFinder/REF/Human-GRCh38-release100_retest/ -d test KD.Aligned.unsorted.out.bam
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.

The test folder has the following info

 cat test/irfinder.stderr 
/SAN/vyplab/alb_projects/tools/IRFinder/bin/util/irfinder: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /SAN/vyplab/alb_projects/tools/IRFinder/bin/util/irfinder)
dg520 commented 4 years ago

@aleighbrown, the reference has been built successfully. Your problem comes from an incompatible libstdc++ version (the GLIBCXX_3.4.20 symbol named in the error). First check that your GCC is >= 4.9.0 so that it supports the C++11 features IRFinder uses. If so, follow this instruction to re-compile the IRFinder core. When complete, re-run

/SAN/vyplab/alb_projects/tools/IRFinder/bin/IRFinder -m BAM -r /SAN/vyplab/alb_projects/tools/IRFinder/REF/Human-GRCh38-release100_retest/ -d test KD.Aligned.unsorted.out.bam
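
The error above names a symbol version, GLIBCXX_3.4.20, that the libstdc++ at /lib64/libstdc++.so.6 does not provide. GCC 4.9.0 is the first release whose libstdc++ carries that tag, which is consistent with the >= 4.9.0 advice. The following is a minimal sketch for checking which tags a given libstdc++ exports. It assumes the strings utility (binutils) is installed, and the function names are illustrative rather than part of IRFinder.

```shell
# Print the distinct GLIBCXX_x.y.z version tags found on stdin, one per line.
list_glibcxx_versions() {
    grep -o 'GLIBCXX_[0-9][0-9.]*' | sort -u
}

# Succeed if the tags on stdin include exactly the tag given as $1.
has_glibcxx_version() {
    list_glibcxx_versions | grep -qxF "$1"
}

# Usage (library path taken from the error message; adjust for your system):
#   strings /lib64/libstdc++.so.6 | list_glibcxx_versions
#   strings /lib64/libstdc++.so.6 | has_glibcxx_version GLIBCXX_3.4.20 \
#       && echo "present" || echo "missing: a newer libstdc++ is needed"
```

If the tag is missing, the prebuilt binary cannot run against that library, and recompiling against the local toolchain (or running with a newer libstdc++) is the usual way out.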
aleighbrown commented 4 years ago

Hi there,

On our cluster the gcc in the /usr/bin is quite old

/usr/bin/gcc --version
gcc (GCC) 4.8.5 20150623

So typically to compile newer code I change the one on the path to

gcc --version
gcc (GCC) 6.3.1 20170216 (Red Hat 6.3.1-3)

Trying to recompile with this as my gcc gives

[annbrown@pryor irfinder]$make clean
rm -f *.o irfinder Depend.list
[annbrown@pryor irfinder]$ make
Makefile:32: Depend.list: No such file or directory
/bin/rm -f ./Depend.list
g++ -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:23:11 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   -MM ReadBlockProcessor.cpp FragmentBlocks.cpp IRFinder.cpp crc32.cpp ReadBlockProcessor_OutputBAM.cpp CoverageBlock.cpp ReadBlockProcessor_CoverageBlocks.cpp BAM2blocks.cpp >> Depend.list
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:23:18 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   FragmentBlocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:23:18 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   ReadBlockProcessor.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:23:18 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   CoverageBlock.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:23:18 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   ReadBlockProcessor_CoverageBlocks.cpp
ReadBlockProcessor_CoverageBlocks.cpp: In member function ‘double CoverageBlocks::percentileFromHist(const std::map<unsigned int, unsigned int>&, uint) const’:
ReadBlockProcessor_CoverageBlocks.cpp:234:9: error: ‘NAN’ was not declared in this scope
  return NAN;
         ^~~
aleighbrown commented 4 years ago

OK, adding #include <cmath> at the top of includedefine.h in the IRFinder source code and re-compiling leads to a different error

make
Makefile:32: Depend.list: No such file or directory
/bin/rm -f ./Depend.list
g++ -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:09 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   -MM ReadBlockProcessor.cpp FragmentBlocks.cpp IRFinder.cpp crc32.cpp ReadBlockProcessor_OutputBAM.cpp CoverageBlock.cpp ReadBlockProcessor_CoverageBlocks.cpp BAM2blocks.cpp >> Depend.list
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   FragmentBlocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   ReadBlockProcessor.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   CoverageBlock.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   ReadBlockProcessor_CoverageBlocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   ReadBlockProcessor_OutputBAM.cpp
ReadBlockProcessor_OutputBAM.cpp: In member function ‘virtual void OutputBAM::ChrMapUpdate(const std::vector<std::basic_string<char> >&)’:
ReadBlockProcessor_OutputBAM.cpp:184:57: warning: unused parameter ‘chrmap’ [-Wunused-parameter]
 void OutputBAM::ChrMapUpdate(const std::vector<string> &chrmap) {
                                                         ^~~~~~
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   BAM2blocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   IRFinder.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   crc32.cpp
g++ -o irfinder -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue 3 Nov 16:30:14 GMT 2020 pryor.cs.ucl.ac.uk:/home/annbrown/tools/IRFinder/src/irfinder"'   FragmentBlocks.o ReadBlockProcessor.o CoverageBlock.o ReadBlockProcessor_CoverageBlocks.o ReadBlockProcessor_OutputBAM.o BAM2blocks.o IRFinder.o crc32.o -static -static-libgcc
/opt/rh/devtoolset-6/root/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lm
/opt/rh/devtoolset-6/root/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -ldl
/opt/rh/devtoolset-6/root/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lpthread
/opt/rh/devtoolset-6/root/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lc
collect2: error: ld returned 1 exit status
Makefile:41: recipe for target 'IRFinderStatic' failed
make: *** [IRFinderStatic] Error 1
dg520 commented 4 years ago

@aleighbrown Several libraries (libm, libdl, libpthread and libc) cannot be found by the linker under your GCC 6.3.1. The instruction provides a workaround for the missing libm but not for the others. You will have to ask your System Admin to help you compile IRFinder under GCC 6.3.1, as I don't know the other library paths from my end.

Alternatively, you can try to load some other versions of GCC, for example, GCC 4.9.0, or 5.3.0 if either exists (Any version >=4.9.0 is fine). Some of those GCCs might be completely configured by your System Admin, so they might be capable of compiling IRFinder core for you. You might still need the workaround for unlinked libm (e.g. -lm error) though.

me37uday commented 3 years ago

Reference building stops at the exact same point in my case as well.

Got the exact same errors when trying to recompile.

I downloaded the missing static libraries as mentioned above using conda and made the necessary changes in the Makefile. On re-compiling, I encountered new errors: cannot find -lgomp and -lstdc++.

Is there no alternate fix for this other than seeking help from cluster admin? That is something I want to use as last resort. Please do let me know :)

Cheers, Uday

dg520 commented 3 years ago

@me37uday which GCC and GLIBC versions are you using? It's very complicated to take the workaround path as the problem sits in the kernel.

me37uday commented 3 years ago
(base) [urangasw@login1 urangasw]$ which gcc
/opt/ohpc/pub/compiler/gcc/8.3.0/bin/gcc
(base) [urangasw@login1 urangasw]$ ldd --version
ldd (GNU libc) 2.17
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

Hope this helps rectify the problem!

dg520 commented 3 years ago

@me37uday Thanks. Could you please show me the exact error message as well? You mentioned that "I downloaded the missing static libraries as mentioned above using conda". Which static libraries do you mean? Your gcc seems to be the system default one, so it will not be aware of static libraries installed by conda. We can try to look into that further, although I strongly recommend asking your system admin for help. Different Linux machines can be set up quite differently, and only the system admin knows them best.

me37uday commented 3 years ago

I wasn't sure of the exact locations of libm.a, libc.a and the other libraries on the cluster, so I downloaded them with the conda install -c asmeurer glibc command and added its path to the Makefile. On trying to recompile, I got new errors: cannot find -lgomp and -lstdc++.

Also the default version of gcc in the cluster is 4.8.5 ish. I used module load intel for the gcc upgrade.

Thoughts and suggestions please. Thanks for such quick response :)

me37uday commented 3 years ago
(irfinder) [urangasw@login1 irfinder]$ make
Makefile:32: Depend.list: No such file or directory
/bin/rm -f ./Depend.list
g++ -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:30 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   -MM ReadBlockProcessor.cpp FragmentBlocks.cpp IRFinder.cpp crc32.cpp ReadBlockProcessor_OutputBAM.cpp CoverageBlock.cpp ReadBlockProcessor_CoverageBlocks.cpp BAM2blocks.cpp >> Depend.list
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   FragmentBlocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   ReadBlockProcessor.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   CoverageBlock.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   ReadBlockProcessor_CoverageBlocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   ReadBlockProcessor_OutputBAM.cpp
ReadBlockProcessor_OutputBAM.cpp: In member function _virtual void OutputBAM::ChrMapUpdate(const std::vector<std::__cxx11::basic_string<char> >&)_:
ReadBlockProcessor_OutputBAM.cpp:184:57: warning: unused parameter _chrmap_ [-Wunused-parameter]
 void OutputBAM::ChrMapUpdate(const std::vector<string> &chrmap) {
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   BAM2blocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   IRFinder.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   crc32.cpp
g++ -o irfinder -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:11:49 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   FragmentBlocks.o ReadBlockProcessor.o CoverageBlock.o ReadBlockProcessor_CoverageBlocks.o ReadBlockProcessor_OutputBAM.o BAM2blocks.o IRFinder.o crc32.o -static -static-libgcc -L/home/urangasw/miniconda3/envs/irfinder/lib
/usr/bin/ld: cannot find -lstdc++
/usr/bin/ld: cannot find -lgomp
collect2: error: ld returned 1 exit status
make: *** [IRFinderStatic] Error 1
me37uday commented 3 years ago

Also, the compilation step is described as optional in the user manual, which makes me wonder: which gcc version is the tool built for by default?

dg520 commented 3 years ago

@me37uday OK, Several problems:

  1. The packages you installed with anaconda are not passed to the gcc you loaded with module load. So your gcc 8.3.0 will only use the libraries whose locations it was given when it was set up.
  2. When your system admin set up gcc 8.3.0, it was not linked against some fundamental libraries. This can happen on a complex cluster, where some gcc versions (usually the most recent ones) are set up conditionally to stay compatible with the rest of the system.

Now several suggestions here, but no guarantees:

  1. Forget the conda path for now.
  2. Restore the path in the Make file. Or re-download the Github version to overwrite your current Make file
  3. Can you module load a lower version of gcc? I would suggest you load the oldest version of GCC that is >=4.9.0.
  4. Re-compile. And what is the exact error message now? That error might be much easier to fix.
  5. If Step 4 failed, let me know your OS version, your gcc used in Step 4 and the error message.
me37uday commented 3 years ago

The irfinder I am trying to work with is downloaded from the github.

step 4 failed.

g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:40:45 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   BAM2blocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:40:45 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   IRFinder.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:40:45 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   crc32.cpp
g++ -o irfinder -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Fri Apr 2 00:40:45 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   FragmentBlocks.o ReadBlockProcessor.o CoverageBlock.o ReadBlockProcessor_CoverageBlocks.o ReadBlockProcessor_OutputBAM.o BAM2blocks.o IRFinder.o crc32.o -static -static-libgcc
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lm
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -ldl
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lpthread
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lc
collect2: error: ld returned 1 exit status
Makefile:41: recipe for target 'IRFinderStatic' failed
make: *** [IRFinderStatic] Error 1
(base) [urangasw@login1 irfinder]$ which gcc
/opt/rh/devtoolset-6/root/usr/bin/gcc
(base) [urangasw@login1 irfinder]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/devtoolset-6/root/usr --mandir=/opt/rh/devtoolset-6/root/usr/share/man --infodir=/opt/rh/devtoolset-6/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --enable-plugin --with-linker-hash-style=gnu --enable-initfini-array --disable-libgcj --with-default-libstdcxx-abi=gcc4-compatible --with-isl=/builddir/build/BUILD/gcc-6.3.1-20170216/obj-x86_64-redhat-linux/isl-install --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.3.1 20170216 (Red Hat 6.3.1-3) (GCC) 
(base) [urangasw@login1 irfinder]$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

(base) [urangasw@login1 irfinder]$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
dg520 commented 3 years ago

@me37uday Please check if you can find libm.a or libm.so at /usr/lib/x86_64-redhat-linux5E/lib64/, as your OS is redhat. If not, try to do

sudo find / -type f -iname "libm.*"

You should use the same method to figure out the FOLDERs of libdl.*, libpthread.* and libc.*.

Then open the Makefile and expand the line of

LDFLAGS_static := -static -static-libgcc

with -L tags.

For example, if your libm.* is at /usr/lib/x86_64-redhat-linux5E/lib64/ while the other three libraries are at /usr/lib/, you should change that line to

LDFLAGS_static := -static -static-libgcc -L/usr/lib/x86_64-redhat-linux5E/lib64/ -L/usr/lib/

This should solve the re-compiling issue, provided the libraries you found above are compatible with your GCC 6.3.1. The original IRFinder was compiled under GCC 4.9.0, but later GCC versions should be backward-compatible. For example, I've successfully compiled it with GCC 9.3.0 on my home machine. The real problem is not the version of GCC; it is whether the versions of the libraries at the system-default locations match the version of GCC. That's why the system admin knows this better than we do.

me37uday commented 3 years ago

I did manage to find the location of a couple of those libraries but for the rest I guess I have to get in touch with the admin. Will keep you updated of how it goes.

Thank you so much for your time and suggestions :)

Cheers, Uday

dg520 commented 3 years ago

@me37uday No problem.

me37uday commented 3 years ago

Hi dg520,

g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue Apr 6 14:40:33 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   BAM2blocks.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue Apr 6 14:40:33 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   IRFinder.cpp
g++ -c -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue Apr 6 14:40:33 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   crc32.cpp
g++ -o irfinder -pipe -std=c++0x -O3    -Wall -Wextra -fopenmp -D'COMPILATION_TIME_PLACE="Tue Apr 6 14:40:33 CEST 2021 login1.mgmt:/home/urangasw/Softwares/IRFinder-1.3.0/src/irfinder"'   FragmentBlocks.o ReadBlockProcessor.o CoverageBlock.o ReadBlockProcessor_CoverageBlocks.o ReadBlockProcessor_OutputBAM.o BAM2blocks.o IRFinder.o crc32.o -static -static-libgcc -L/lib64/
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lm
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -ldl
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lpthread
/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/ld: cannot find -lc
collect2: error: ld returned 1 exit status
Makefile:41: recipe for target 'IRFinderStatic' failed
make: *** [IRFinderStatic] Error 1
(base) [urangasw@login1 irfinder]$ ldconfig -p | grep libc.so 
        libc.so.6 (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib64/libc.so.6
(base) [urangasw@login1 irfinder]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt2/rh/devtoolset-6/root/usr/bin/../libexec/gcc/x86_64-redhat-linux/6.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/devtoolset-6/root/usr --mandir=/opt/rh/devtoolset-6/root/usr/share/man --infodir=/opt/rh/devtoolset-6/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --enable-plugin --with-linker-hash-style=gnu --enable-initfini-array --disable-libgcj --with-default-libstdcxx-abi=gcc4-compatible --with-isl=/builddir/build/BUILD/gcc-6.3.1-20170216/obj-x86_64-redhat-linux/isl-install --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.3.1 20170216 (Red Hat 6.3.1-3) (GCC) 

The admin helped me with knowing the path of the required libraries and it is all contained in /lib64/ folder. And as you can see, though I am using gcc version 6 and I have mentioned the path of the libraries in the Makefile, I'm still getting the same errors. Thoughts and suggestion please!!

dg520 commented 3 years ago

@me37uday What are the paths to these four libraries? And how did you set the Make file line? Could you please paste them here?

me37uday commented 3 years ago
(base) [urangasw@login1 ~]$ ldconfig -p | grep libm.so
        libm.so.6 (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib64/libm.so.6
        libm.so (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib64/libm.so
(base) [urangasw@login1 ~]$ ldconfig -p | grep libdl.so
        libdl.so.2 (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib64/libdl.so.2
        libdl.so (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib64/libdl.so
(base) [urangasw@login1 ~]$ ldconfig -p | grep libpthread.so
        libpthread.so.0 (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib64/libpthread.so.0
(base) [urangasw@login1 ~]$ ldconfig -p | grep libc.so
        libc.so.6 (libc6,x86-64, OS ABI: Linux 2.6.32) => /lib64/libc.so.6

Adjacent terminal pane (GNU nano 2.3.1, File: Makefile):

OBJECTS := FragmentBlocks.o ReadBlockProcessor.o CoverageBlock.o ReadBlockProcessor_CoverageBlocks.o$
SOURCES=$(wildcard *.cpp)
LDFLAGS :=
OPTIMFLAGS :=
OPTIMFLAGS1 :=
# below flags make little difference.

In the Makefile :

LDFLAGS_static := -static -static-libgcc -L/lib64/

dg520 commented 3 years ago

@me37uday Seems your gcc was configured with shared libraries. Could you please try to change to the following Make file line?

LDFLAGS_static := -static-libgcc -L/lib64/

Does the recompilation work now?
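
A likely reason this change matters: with -static, the linker resolves -lm, -ldl, -lpthread and -lc against static archives (libm.a and so on) only, whereas the ldconfig output above lists only shared objects (.so) in /lib64/. The sketch below reports, for each library, whether a static archive exists in a directory. The function name is illustrative; on CentOS/RHEL the static archives are typically shipped in a separate glibc-static package, which may not be installed on a cluster.

```shell
# For each library NAME, report whether DIR holds a static archive
# (libNAME.a), only shared objects (libNAME.so*), or neither.
# Usage: lib_kinds DIR NAME...
lib_kinds() (
    dir=$1
    shift
    for name in "$@"; do
        if [ -e "$dir/lib$name.a" ]; then
            echo "$name static"
        elif ls "$dir/lib$name".so* >/dev/null 2>&1; then
            echo "$name shared-only"
        else
            echo "$name missing"
        fi
    done
)

# Usage (directory taken from the ldconfig output in this thread):
#   lib_kinds /lib64 m dl pthread c
```

If every library is reported as shared-only, linking with -static cannot succeed from that directory, and dropping -static produces a dynamically linked binary instead.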

me37uday commented 3 years ago

Finally!! Yes, it worked :D

Thanks @dg520.

You could probably update a few things in the installation page, specifically this :

$ ldconfig -p | grep (name of the library) to find the path of the libraries so that it could be updated in the Makefile.

Also, I had found another way of making it work, but it is very absurd. I'll check whether the results from this current version match it, to know for sure whether the other version worked correctly.

Thanks again for everything @dg520. Have a nice day!!

dg520 commented 3 years ago

@me37uday That's great. Thanks for bearing with me. I hope IRFinder will be helpful for your research work. P.S. ldconfig might not be available on some Linux machines. But really appreciate your suggestion.

me37uday commented 3 years ago

My happiness was short-lived.

Yes, I copied the new irfinder executable file to bin/util. In spite of successful compilation, while building the reference it does not finish completely.

Any ideas why this might be happening?

dg520 commented 3 years ago

@me37uday What is the error message?

me37uday commented 3 years ago
Launching reference build process. The full build might take hours.
<Phase 1: STAR Reference Preparation>
Apr 06 17:10:30 ... copying the genome FASTA file...
Apr 06 17:10:39 ... copying the transcriptome GTF file...
Apr 06 17:10:44 ... copying the STAR reference folder...
<Phase 2: Mapability Calculation>
Apr 06 17:12:06 ... mapping genome fragments back to genome...
Apr 06 17:53:36 ... sorting aligned genome fragments...
[bam_sort_core] merging from 60 files and 20 in-memory blocks...
Apr 06 18:03:22 ... indexing aligned genome fragments...
Apr 06 18:04:25 ... filtering aligned genome fragments by chromosome/scaffold...
Apr 06 18:09:49 ... merging filtered genome fragments...
Apr 06 18:11:30 ... calculating regions for exclusion...
Apr 06 18:18:57 ... cleaning temporary files...
<Phase 3: IRFinder Reference Preparation>
Apr 06 18:19:03 ... building Ref 1...
Apr 06 18:19:47 ... building Ref 2...
Apr 06 18:19:51 ... building Ref 3...
Apr 06 18:19:51 ... building Ref 4...
Apr 06 18:20:32 ... building Ref 5...
Apr 06 18:21:17 ... building Ref 6...
Apr 06 18:21:17 ... building Ref 7...
Apr 06 18:21:20 ... building Ref 8...
Apr 06 18:21:22 ... building Ref 9...
Apr 06 18:21:22 ... building Ref 10c...
Apr 06 18:21:22 ... building Ref 11c...

However, the necessary files seem to be generated. But when I go on to quantify IR, I get an empty output: only the output directory is created. There is no error in the out file either.

dg520 commented 3 years ago

@me37uday The reference building is successful. Please note, the reference building process is totally irrelevant to your compiling issue. It doesn't need irfinder core at all. And your current problem is still in irfinder core I believe. If you haven't run make clean before compiling, please go to the src/irfinder folder and run the following:

make clean
make
cp irfinder ../../bin/util/

And then run the quantification step again (You don't need to prepare the reference again). If it still failed and the output folder was totally empty, I would say there might be incompatibility between the compiler and the C++ libraries used in irfinder. If there were some files in the output folder, let me know what are they.
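
One way to test for such an incompatibility before re-running the quantification is to ask the dynamic loader whether the rebuilt binary can resolve all of its libraries. This is a minimal sketch, assuming ldd is available; the function name is illustrative. The first report in this thread found its loader error in irfinder.stderr inside the output directory, so that file is also worth checking when it exists.

```shell
# Read `ldd` output on stdin and print every line that reports a missing
# library or symbol version. Returns 0 if nothing is missing, 1 otherwise.
report_unresolved() {
    if grep 'not found'; then
        return 1
    fi
    return 0
}

# Usage (binary path as used in this thread, run from the IRFinder folder):
#   ldd bin/util/irfinder 2>&1 | report_unresolved \
#       && echo "all dependencies resolved"
```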

me37uday commented 3 years ago

Output folder was totally empty. Maybe incompatibility between the compiler and the C++ libraries used in irfinder as you say.

Okay. So here's a way a colleague of mine has cracked it and it works for me as well as him :

  1. We download irfinder from github on our local PC and compile it as mentioned in the manual.
  2. Here's the fun part: on the cluster, we create a conda environment, activate it and install all the necessary tools with conda, i.e. samtools, bedtools and STAR (make sure you install the required versions, and look up the conda commands for installation), plus gcc 6 using the following command: conda install -c omgarcia gcc-6
  3. Just copy/move the entire irfinder folder from your local machine to your cluster environment. Set up your ~/.bashrc file if you want to.
  4. Voila, it works!

What surprises me is, my PC has a gcc version of 9 when I compile it and when I run it on the cluster (the newly created conda environment) the gcc version there is 6 and somehow they are compatible!! I get all the 8 output files as expected (my data is unstranded and I get IRFinder-IR-nondir.txt file).

@dg520 what do you have to say about this?!

dg520 commented 3 years ago

@me37uday That's a heroic effort by you and your colleague. Glad to hear about the workaround. That path totally makes sense to me. In the conda virtual environment, conda makes sure everything is compatible with everything else, so the conda gcc should be a fully configured one. Any fully configured gcc from 4.9.0 onwards should compile a working irfinder, as we only use C++11 features, which are supported from 4.9.0. Later GCC versions are generally backward-compatible.

Your server-side GCC 6 might be partially configured, which is totally possible because some libraries cannot work with very old hardware.

me37uday commented 3 years ago

@dg520 Yup, it's working.

I was curious about couple of things, so in the downstream analysis where you say and I quote,

# This tests if the number of IR reads are significantly different from normal spliced reads, in the KO samples.
# We might only be interested in the "log2FoldChange" column, instead of the significance.
# This is because "log2FoldChange" represents log2(number of intronic reads/number of normal spliced reads).
# So we the value of (intronic reads/normal spliced reads) by

Which column(s) or sum of columns in the output file represent the counts of the number of intronic reads and normal spliced reads?

Should I have started a new question?

dg520 commented 3 years ago

@me37uday The normal spliced read count is in Column 19. Note that we don't record the actual number of intronic reads in the output file. Instead, we record the median sequencing depth inside each intron, in Column 9, which serves a similar purpose. If you see "intronic reads" mentioned in the manual, please be aware that it is not a precise definition and refers to Column 9 in the output file.

Please refer here for more details about the output file.

Yes, please start a new thread if you have follow-up questions.
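
To look at the (Column 9 / Column 19) quantity directly, the two columns can be pulled out of the output file with awk. This is a minimal sketch under stated assumptions: the file is tab-separated with a single header line, Columns 1 to 3 hold the intron coordinates, and introns with a zero in Column 19 are skipped to avoid division by zero. The function name is illustrative.

```shell
# Print columns 1-3 followed by Column 9 / Column 19 for each data row.
# Rows whose Column 19 is 0 are skipped; the first (header) line is ignored.
ir_depth_over_spliced() {
    awk -F'\t' -v OFS='\t' 'NR > 1 && $19 > 0 { print $1, $2, $3, $9 / $19 }' "$@"
}

# Usage (file name as mentioned earlier in this thread for unstranded data):
#   ir_depth_over_spliced IRFinder-IR-nondir.txt | head
```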

iBiology commented 3 years ago

I also have issue with reference build (hg19 with pre-existing local STAR index). I submitted the reference build job to a cluster with 32 cpus and 64G memory with the following command:

IRFinder -m BuildRefFromSTARRef -r /path/to/irfinder/genomes/hg19 -x /path/to/star.2.7.9a/genomes/hg19 -t 32

IRFinder does not throw any error, but it seems to be stuck at the mappability calculation stage for hours without any progress:

Launching reference build process. The full build might take hours.
<Phase 1: STAR Reference Preparation>
Aug 06 14:18:10 ... copying the genome FASTA file...
Aug 06 14:18:18 ... copying the transcriptome GTF file...
Aug 06 14:18:21 ... copying the STAR reference folder...
<Phase 2: Mapability Calculation>
Aug 06 14:19:40 ... mapping genome fragments back to genome...

IRFinder is version 1.3.1 and STAR is 2.7.9a. Twelve hours later, nothing new has been added to the log. I am not sure how long building a human reference (hg19) usually takes. STAR's Log.progress.out contains only the header line, and Log.out shows all 31 threads being created and nothing after that. Inside the Mappability directory there are a tmp_6420 directory and a _STARtmp directory, but both are empty. genome_fragments.sam contains only header lines.

Any thoughts on what happened here? Does the package provide a pre-built hg19 reference that can be downloaded and used directly, so that not every user has to build it?

dg520 commented 3 years ago

@iBiology This is a different problem from the one originally described in this thread. Your problem is due to STAR rather than IRFinder itself (i.e. it is the STAR alignment step that is stuck). I recommend consulting the STAR forum, although I can offer some options:

  1. You can try BuildRef mode with the FTP link to hg19 here: (ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz)
  2. Try another version of STAR if you can. IRFinder has been tested on STAR 2.5.2
  3. Try to trim your local FASTA file to only contain main chromosomes when using BuildRefProcess mode.
  4. Try the combination of 2 and 3.
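
Option 3 above can be sketched with awk. This is a minimal example, assuming Ensembl-style sequence names (1 to 22, X, Y, MT) where the name is the first word after ">"; the function name and the file names in the usage comment are hypothetical.

```shell
# Keep only FASTA records whose sequence name (first word after ">") is in
# the space-separated list passed as $1. Reads FASTA on stdin.
keep_chromosomes() {
    awk -v keep="$1" '
        BEGIN { n = split(keep, names, " "); for (i = 1; i <= n; i++) want[names[i]] = 1 }
        /^>/  { name = substr($1, 2); printing = (name in want) }
        printing
    '
}

# Usage (hypothetical file names):
#   zcat genome.fa.gz \
#       | keep_chromosomes "$(seq 1 22 | tr '\n' ' ') X Y MT" > genome.main.fa
```

Dropping unplaced scaffolds and alternate haplotypes this way shrinks the genome that STAR has to map fragments against, which is the point of the suggestion.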

Please open a new thread if you have follow-up questions. Thanks.