williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

Error in reference building using IRfinder #100

Closed ucbtsh6 closed 4 years ago

ucbtsh6 commented 4 years ago

I'm using this command to build the reference in IRFinder:

bin/IRFinder -m BuildRefProcess -r REF/tt -f REF/tt/genome.fa -g REF/tt/transcripts.gtf

where genome.fa is the soft-masked reference FASTA and transcripts.gtf is the GTF file of the reference genome. I got this error:

readlink: illegal option -- f
usage: readlink [-n] [file ...]
bin/IRFinder: line 254: /proc/meminfo: No such file or directory
bin/IRFinder: line 255: /1000: syntax error: operand expected (error token is "/1000")
bin/IRFinder: line 274: [: -lt: unary operator expected
bin/IRFinder: line 289: /proc/cpuinfo: No such file or directory
grep: /proc/cpuinfo: No such file or directory
Launching reference build process. The full build might take hours.
bin/IRFinder: line 338: ./util/IRFinder-BuildRefFromEnsembl: No such file or directory

(base) ITs-MacBook-Pro:IRFinder-1.3.0 Shaimaa$ gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/c++/4.2.1
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

dg520 commented 4 years ago

Hi @ucbtsh6 ,

Are you using macOS? As the manual says, we cannot guarantee compatibility on Mac. In your specific case:

1) IRFinder cannot get your CPU and memory information from the /proc folder.
2) Your readlink is either not up to date or not a typical Linux version.
3) Your GCC is too old.
4) We strongly advise against running IRFinder on a personal computer. IRFinder requires at least 8 physical cores and 48 GB of memory to work efficiently. Otherwise, your job might never finish, and it might keep you from using your computer while IRFinder is running, because IRFinder occupies too many resources.

If your system does have enough resources and you really want to run IRFinder on Mac, read the following. But please be aware:

1) You proceed at your own risk of damaging your software and/or hardware, including but not limited to blocking the OS, losing data, and damaging other software.
2) I really hope you understand each step and its consequences before trying.
3) I don't guarantee it will work in the end.

For Issue 1: You need to modify the source code in bin/IRFinder a bit. Change line 253 from

MEMK=`awk '($1 ~ /^MemTotal:/) {print $2}' < /proc/meminfo`

to

MEMK=XXXX

Replace XXXX with your system's total memory size as a number, in kB. Do not put a space after the equals sign, or the shell will not assign the value.
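If you are unsure of that number, the following is a minimal sketch for looking it up. It is not part of IRFinder: the helper name total_mem_kb is made up for illustration, and the Mac branch assumes that `sysctl -n hw.memsize` reports the installed memory in bytes.

```shell
#!/usr/bin/env bash
# total_mem_kb is a hypothetical helper, not part of IRFinder.
# It prints the total system memory in kB.
total_mem_kb() {
  if [ -r /proc/meminfo ]; then
    # Linux: same lookup IRFinder performs on line 253
    awk '($1 ~ /^MemTotal:/) {print $2}' /proc/meminfo
  else
    # Mac (assumption): hw.memsize is in bytes, so convert to kB
    echo $(( $(sysctl -n hw.memsize) / 1024 ))
  fi
}

total_mem_kb
```

The printed number is what would go in place of XXXX.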

For Issue 2: You need to set up an alternative version of readlink specifically for Mac, following the instructions here.
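For reference, the missing behaviour is `readlink -f`, which prints a path with all symlinks resolved. The sketch below is one possible stand-in written in plain shell. It is an assumption on my side and not necessarily what the linked instructions describe; resolve_path is a hypothetical name, and the function handles only ordinary files and links (not the root directory itself).

```shell
#!/usr/bin/env bash
# resolve_path is a hypothetical helper, not part of IRFinder.
# It prints the physical path of its argument, following symlinks,
# roughly like `readlink -f` on Linux. The body runs in a subshell
# (note the parentheses), so the caller's working directory is untouched.
resolve_path() (
  target=$1
  hops=0
  cd "$(dirname "$target")" || exit 1
  target=$(basename "$target")
  # Follow symlinks one hop at a time; give up after 40 hops to avoid loops
  while [ -L "$target" ]; do
    hops=$((hops + 1))
    [ "$hops" -le 40 ] || exit 1
    target=$(readlink "$target")
    cd "$(dirname "$target")" || exit 1
    target=$(basename "$target")
  done
  # pwd -P prints the current directory with symlinks resolved
  printf '%s/%s\n' "$(pwd -P)" "$target"
)
```

Calling `resolve_path bin/IRFinder` from the IRFinder directory would print the absolute location of the script.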

For Issue 3: You can try to use Anaconda to install a newer GCC (>=4.9.0). After installing, call gcc --version to ensure it's the default GCC and the version is correct.

Then you need to run IRFinder with the -t option, which tells IRFinder how many CPU cores to use, because IRFinder cannot detect the number of available CPUs on Mac. The number you give cannot exceed the actual number of CPU cores on your machine. For your example command, it should now be:

bin/IRFinder -t 2 -m BuildRefProcess -r REF/tt -f REF/tt/genome.fa -g REF/tt/transcripts.gtf

which will use 2 CPU cores.
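To pick a safe value for -t, something like the sketch below could help. It is not part of IRFinder: available_cores and capped_threads are hypothetical names, and the lookup assumes that `getconf _NPROCESSORS_ONLN` or, failing that, `sysctl -n hw.ncpu` is available. Both report logical CPUs, which can be more than the number of physical cores.

```shell
#!/usr/bin/env bash
# available_cores and capped_threads are hypothetical helpers,
# not part of IRFinder.

# Print the number of logical CPUs currently online.
available_cores() {
  getconf _NPROCESSORS_ONLN 2>/dev/null || sysctl -n hw.ncpu
}

# Print the requested thread count, lowered to the CPU count if it exceeds it.
capped_threads() {
  local requested=$1
  local cores
  cores=$(available_cores)
  if [ "$requested" -gt "$cores" ]; then
    echo "$cores"
  else
    echo "$requested"
  fi
}

# Example use with the command from this thread:
# bin/IRFinder -t "$(capped_threads 2)" -m BuildRefProcess -r REF/tt \
#   -f REF/tt/genome.fa -g REF/tt/transcripts.gtf
```

Capping the value this way keeps -t within the limit described above even if the same command is reused on a smaller machine.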

Please let me emphasize again: you apply these changes at your own risk! You might also encounter other problems in IRFinder after fixing the above, as I'm not familiar with Mac and cannot estimate how many incompatibilities exist in total.

Best, Dadi