RitchieLabIGH / IRFinder

segmentation fault error #5

Closed: lyj95618 closed this issue 2 years ago

lyj95618 commented 2 years ago

Hello,

I am trying to run IRFinder and run into this issue:

IRFinderBAM: line 163: 3028 Segmentation fault (core dumped) ${LIBEXEC}/irfinder ${OUTPUTDIR} ${REF}/IRFinder/ref-cover.bed ${REF}/IRFinder/ref-sj.ref ${REF}/IRFinder/ref-read-continues.ref ${REF}/IRFinder/ref-ROI.bed ${READ_TYPE} "${AI_WARN}:${AI_INTRON}:${AI_RATIO}" "${JITTER}" $1 >> $OUTPUTDIR/logs/irfinder.stdout 2>> $OUTPUTDIR/logs/irfinder.stderr

I followed the wiki to download v2.0.tar.gz and installed it by adding the execution permissions. I am using gcc/7.2.0 and boost/1.71.0. I am running this on a cluster environment.

Thank you very much!

CloXD commented 2 years ago

Hello! Sorry for the late answer. To simplify execution and improve reproducibility, I suggest using the Docker or Singularity image. You're probably missing some dependencies (listed here: https://github.com/RitchieLabIGH/IRFinder/blob/main/Dockerfile ). Is it producing any logs? Is the file logs/irfinder.stderr in the output directory empty? Which OS are you using (Linux, Mac, Linux under Windows...)? Cheers, Claudio
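The two checks asked for above (is the stderr log empty, and are any shared libraries missing) can be sketched from the shell as follows. This is only an illustration: `OUTPUTDIR` and `IRFINDER_BIN` are placeholder paths, and `log_status` and `missing_libs` are hypothetical helper names, not part of IRFinder. `ldd` is the standard Linux tool for listing a binary's shared-library dependencies.

```shell
# Hypothetical locations: adjust both to your own run and install.
OUTPUTDIR="${OUTPUTDIR:-./irfinder_test}"
IRFINDER_BIN="${IRFINDER_BIN:-/path/to/irfinder}"

# Report whether a log file has any content (argument: path to the log).
log_status() {
    if [ -s "$1" ]; then
        echo "non-empty"
    else
        echo "empty or missing"
    fi
}

# Print the shared libraries the dynamic linker cannot resolve for a binary.
missing_libs() {
    ldd "$1" 2>/dev/null | grep "not found"
}

echo "irfinder.stderr: $(log_status "$OUTPUTDIR/logs/irfinder.stderr")"

if [ -x "$IRFINDER_BIN" ]; then
    missing_libs "$IRFINDER_BIN" || echo "no unresolved libraries reported"
else
    echo "no executable at $IRFINDER_BIN; set IRFINDER_BIN to the irfinder binary"
fi
```

Any line printed by `missing_libs` names a library that is not visible in the current environment, which on a cluster usually points to a module that still needs to be loaded.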

lyj95618 commented 2 years ago

Hello Claudio, Thanks for your reply. The irfinder.stderr is empty and the irfinder.stdout has the following output:

--------------------
|  IRFinder v. 2.0.0 |
 --------------------

---
IRFinder version: 2.0.0
IRFinder start:  Mon Feb 21 17:51:39 EST 2022
IRFinder runmode: BAM
IRFinder user@host: yliang @ node032
IRFinder working dir:  /mydir/genomes/hg37
IRFinder reference: /mydir/genomes/hg37/IRfinder_index
IRFinder file 1: /mydir/analysis/processed/test.bam
---
[  Mon Feb 21 17:51:39 EST 2022  ] Processing the BAM file with IRFinder
---
The given bam file is sorted by coordinate and is paired.
IRFinder run with options:
 - Output Dir:              /mydir/test_place/irfinder_test
 - Main intron ref.:        /mydir/genomes/hg37/IRfinder_index/IRFinder/ref-cover.bed
 - Splice junction ref.:    /mydir/genomes/hg37/IRfinder_index/IRFinder/ref-sj.ref
 - Read spans ref.:         /mydir/genomes/hg37/IRfinder_index/IRFinder/ref-read-continues.ref
 - Optional ROI ref.:       /mydir/hg37/IRfinder_index/IRFinder/ref-ROI.bed
 - Read type:               SR
 - AI levels:               1:1:0.05
 - Input BAM:               /mydir/analysis/processed/test.bam

Preparing the reference:
 - Junction count...done.
 - Span points...done.
 - Coverage blocks...done.
 - ROI...done

I am running this on a cluster, so I can't use Docker, and unfortunately I am not familiar with Singularity either :( The OS is CentOS 7. The following are the modules that I loaded while running IRFinder. Am I still missing dependencies? I am still getting the segmentation fault error.

module load perl
module load python/3.8.0
module load curl/7.74.0
module load zlib/1.2.8
module load libxml2/2.9.1
module load star/2.6.1c
module load samtools
module load bedtools
module load boost/1.71.0
module load liblzma/5.2.2_alpha
module load R/3.5.1
module load gcc/7.2.0

Thank you so much for your help! Laur

CloXD commented 2 years ago

Hello Laur, I'm sorry you encountered this issue. The program seems to start without problems up to the reading of the BAM. Does it crash right after printing " - ROI...done" or after a while? In the second case, it might be an issue with the sorted BAM. Could you try sorting it by name (using samtools sort -n ...) and use that file as input? In the first case, it's probably an issue related to library linking, for which you might need to compile it yourself:

git clone https://github.com/RitchieLabIGH/IRFinder.git
cd ./IRFinder
chmod +x ./install.sh
./install.sh local

This will install IRFinder in ~/.local/IRFinder.

Let me know if this helps. Cheers, Claudio
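The name-sorting suggestion above can be sketched as follows. The input path is the one shown in the log earlier in this thread; the output name and the thread count are assumptions chosen for illustration. `-n` sorts by read name, `-o` sets the output file, and `-@` sets the number of additional threads.

```shell
# Input BAM as reported in the IRFinder log above; adjust as needed.
IN_BAM="/mydir/analysis/processed/test.bam"
# Hypothetical output name: same path with a .namesorted.bam suffix.
OUT_BAM="${IN_BAM%.bam}.namesorted.bam"

# Sort by read name only if samtools and the input file are available.
if command -v samtools >/dev/null 2>&1 && [ -f "$IN_BAM" ]; then
    samtools sort -n -@ 4 -o "$OUT_BAM" "$IN_BAM"
else
    echo "samtools or $IN_BAM not available; nothing sorted"
fi
```

The name-sorted file (`$OUT_BAM`) would then be passed to IRFinder in place of the coordinate-sorted BAM.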

lyj95618 commented 2 years ago

Hello Claudio,

Thank you for your help. I decided to try the Singularity image after doing some research on how to use it. I downloaded the image by running wget https://github.com/RitchieLabIGH/IRFinder/releases/download/v2.0/IRFinder and decided to re-run the reference build first with the following command:

singularity run IRFinder -m BuildRefFromSTARRef -l -r /mydir/genomes/hg37/IRfinder_index -x /mydir/genomes/hg37/star_index -f /mydir/genomes/hg37/hs37d5_spikein.fa -g /mydir/genomes/hg37/Homo_sapiens.GRCh37.87_spikein.gtf

Now it gives the following error, even though my STAR index is there (the last time I ran BuildRefFromSTARRef without the Singularity image, it gave no error):

/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
realpath: /mydir/genomes/hg37/star_index: No such file or directory

Thank you Laur

lyj95618 commented 2 years ago

OK, I think I know why. I should set the environment variable SINGULARITY_BINDPATH. I will let you know if things go wrong again. Thanks : )
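A minimal sketch of that fix, assuming all inputs live under /mydir as in the command above (the exact value used by the poster is not given in the thread). By default the container only sees a few host directories, so `realpath` inside it cannot find /mydir/genomes/hg37/star_index until the host path is bound.

```shell
# Make the host directory holding the genome, STAR index and output
# visible inside the container. /mydir is assumed from the paths above.
export SINGULARITY_BINDPATH="/mydir"

# Re-run the reference build only if singularity and the image are present.
if command -v singularity >/dev/null 2>&1 && [ -f ./IRFinder ]; then
    singularity run IRFinder -m BuildRefFromSTARRef -l \
        -r /mydir/genomes/hg37/IRfinder_index \
        -x /mydir/genomes/hg37/star_index \
        -f /mydir/genomes/hg37/hs37d5_spikein.fa \
        -g /mydir/genomes/hg37/Homo_sapiens.GRCh37.87_spikein.gtf
fi
```

The same binding can be given per command with Singularity's `-B` (`--bind`) option, for example `singularity run -B /mydir IRFinder ...`, which avoids changing the environment for other jobs.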

lyj95618 commented 2 years ago

With the singularity image, the tool is currently running fine. Thanks. I just have two more questions about the output.

  1. What filters would you recommend, and on which columns, to get high-confidence intron retention events?
  2. If I am interested in full intron retention, should I look at introns with an IR ratio close to 1?

Thank you for your help! Laur

CloXD commented 2 years ago

Great, I'm glad you solved the problem.

  1. This is a tricky question. It partly depends on the sequencing depth of your RNA-seq: the higher the depth, the more stringent your filter can be. By default, IRFinder raises a series of warnings ( https://github.com/RitchieLabIGH/IRFinder/wiki/IRFinder-Output ) in case of low intron coverage, low splicing, etc., but those use fixed thresholds (for example, LowCover is raised only when fewer than 10 reads support the intron, spliced or not). The IRFinder-IR-[dir|nondir]-val.txt output contains IR events validated by the CNN model, which are more likely to be real IR events. By default, introns that raise warnings are excluded from the CNN validation, but you can include some or all of them using the parameter -w:
    
    -R double : Minimum IRratio accepted to consider the intron for the CNN validation. Default: 0.05 
    -w int : Warning level accepted to consider the intron for the CNN validation. Default: 1
         0: Disabled 
         1: Only without warning 
         2: Include NonUniformIntronCover  
         3: Include also MinorIsoform
         4: Include also LowSplicing
         5: Include also LowCover ( consider all )

  2. Yes, but if they are not included in the CNN-validated list, we suggest visually inspecting a few of them (or all of them, if there are not too many).
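As a rough illustration of the two answers above, the sketch below keeps only introns that carry no warning and whose IR ratio is at or above a chosen cutoff. It assumes a tab-separated result table whose header contains columns named `IRratio` and `Warnings`, and that a row without warnings shows `-` in the `Warnings` column. Please check these against the IRFinder-Output wiki page before relying on it. `filter_ir` is a hypothetical helper name, and the 0.9 cutoff is an arbitrary example, not a recommended value.

```shell
# filter_ir FILE MIN_RATIO
# Print the header plus rows with no warning ("-") and IRratio >= MIN_RATIO.
# The columns are located by their (assumed) header names, not by position.
filter_ir() {
    awk -F'\t' -v min="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "IRratio")  r = i
                if ($i == "Warnings") w = i
            }
            if (!r || !w) {
                print "expected columns not found" > "/dev/stderr"
                exit 1
            }
            print
            next
        }
        $w == "-" && $r + 0 >= min + 0
    ' "$1"
}

# Example call (file name follows the pattern mentioned above):
# filter_ir IRFinder-IR-nondir-val.txt 0.9 > near_full_retention.txt
```

Looking the columns up by name keeps the filter working if the column order differs between output files, and it fails loudly if the expected header is absent.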

I hope this was helpful.
Cheers,
Claudio