williamritchie / IRFinder

Detecting intron retention from RNA-Seq experiments

IRFinder-1.0.0/bin/IRFinder: line 547: 57420 Broken pipe #6

Closed yuxinghai closed 7 years ago

yuxinghai commented 7 years ago

I am using IRFinder to detect IR events, but when I run the command as shown in example 1:

IRFinder -d ${name}_new -a off -s LoadAndKeep -r /data2/zhoulab/yuxinghai/software/IRFinder-1.0.0/REF/Human-hg38-release84 ${fq1} ${fq2}

it fails with an error. STAR version: STAR_2.5.0a

/software/IRFinder-1.0.0/bin/IRFinder: line 547: 57420 Broken pipe "$STAREXEC" --genomeLoad $STARMEMORYMODE --runThreadN $THREADS --genomeDir "$REF/STAR" --outFilterMultimapNmax 1 --outSAMstrandField intronMotif --outFileNamePrefix "${OUTPUTDIR}/" --outSAMunmapped None --outSAMmode NoQS --outSAMtype BAM Unsorted --outStd BAM_Unsorted --readFilesIn "$1" "$2" $EXTRAREADFILESCOMMAND
57421 | tee "$OUTPUTDIR/Unsorted.bam"
57422 | gzip -cd
57423 Aborted (core dumped) | "$LIBEXEC/irfinder" "$OUTPUTDIR" "$REF/IRFinder/ref-cover.bed" "$REF/IRFinder/ref-sj.ref" "$REF/IRFinder/ref-read-continues.ref" "$REF/IRFinder/ref-ROI.bed" "$OUTPUTDIR/unsorted.frag.bam" >> "$OUTPUTDIR/irfinder.stdout" 2>> "$OUTPUTDIR/irfinder.stderr"
ERROR: IRFinder appears not to have completed. It appears an unknown component crashed.

Can someone help me?

andpet0101 commented 7 years ago

Hi,

I had the same error and found that ref-cover.bed and other files in the reference directory were empty. I tracked the issue down to a bug in the script Mapability. This script generates pseudoreads from the genome, remaps them and then identifies regions where the reads did not map (that is, regions with low mappability). In line 51, this code

time ls "$TMPCHR"/*.bed."$TMPEXT" | xargs --max-args 1 --max-procs "$THREADS" -I{} bash -c "\"$TMPCMP\" -cd < {} | bedtools genomecov -i stdin -bga -g \"$CHRLEN\"  | awk 'BEGIN {FS=\"\t\"; OFS=\"\t\"} (\$4 < 5) {print \$1, \$2, \$3}' | bedtools merge -i stdin > {}.exclusion"

is supposed to summarise, for each chromosome (the ls), the number of reads mapped per base, in order to find regions with low mapping rates. The problem is that bedtools genomecov reports coverage not only for that particular chromosome but for all other chromosomes too, thus creating large regions with zero coverage (since the input contains data for that chromosome only). Fix the line as follows:

## added  |  awk 'NR==1{chr=\$1;print}\$1==chr{print}' |
time ls "$TMPCHR"/*.bed."$TMPEXT" | xargs --max-args 1 --max-procs "$THREADS" -I{} bash -c "\"$TMPCMP\" -cd < {} | bedtools genomecov -i stdin -bga -g \"$CHRLEN\"  |  awk 'NR==1{chr=\$1;print}\$1==chr{print}' | awk 'BEGIN {FS=\"\t\"; OFS=\"\t\"} (\$4 < 5) {print \$1, \$2, \$3}' | bedtools merge -i stdin > {}.exclusion"
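To see what the added awk stage does in isolation, here is a minimal sketch that runs without bedtools. The function name `keep_first_chrom` and the sample rows are made up for illustration. Like the fix above, it assumes that the first row of the stream belongs to the chromosome of interest. It also drops the `print` from the `NR==1` action. As written in the fix, both rules fire on the first row, so that row is emitted twice; that looks harmless because the later `bedtools merge` collapses identical intervals.

```shell
# keep_first_chrom: read tab-separated rows on stdin and pass through only
# those whose first column matches the first row's chromosome.
# (Hypothetical helper name; the awk program mirrors the added filter.)
keep_first_chrom() {
    awk 'NR==1 {chr=$1} $1==chr {print}'
}

# Made-up bedGraph-like rows: chromosome, start, end, coverage.
# The chr2 row stands in for the zero-coverage rows that genomecov
# reports for chromosomes absent from the input.
printf 'chr1\t0\t100\t0\nchr1\t100\t200\t7\nchr2\t0\t500\t0\n' | keep_first_chrom
```

Only the two chr1 rows should come out; the chr2 row is dropped before the coverage filter sees it.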

Then rebuild the entire reference directory and the file ref-cover.bed should contain data and IRFinder should work.
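A quick way to check whether a rebuilt reference still contains empty files is to list zero-byte files with `find`. This is only a sketch: the helper name `list_empty_files` is made up, and the demo runs on a throwaway directory rather than a real reference.

```shell
# list_empty_files DIR: print every zero-byte regular file under DIR.
# (Hypothetical helper; -size 0 matches files that occupy zero blocks.)
list_empty_files() {
    find "$1" -type f -size 0
}

# Demo on a temporary directory; the file names mirror those in the log.
demo=$(mktemp -d)
: > "$demo/ref-cover.bed"                      # empty, as in the failing build
printf 'chr1\t0\t100\n' > "$demo/ref-ROI.bed"  # non-empty
list_empty_files "$demo"                       # prints only .../ref-cover.bed
rm -rf "$demo"
```

On a real installation one would point it at the reference directory instead, for example `list_empty_files "$REF/IRFinder"`, where `$REF` is the directory passed to `-r`. No output means no empty files were found.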

Andreas

dg520 commented 7 years ago

Hi all,

Sorry for the late reply. We have just released a new version, IRFinder 1.2.0. It includes several major fixes and upgrades, including a fix for this "bedtools genomecov" problem. Thanks a lot to @andpet0101 for the suggestion and contribution!

Best, Dadi

xiaoyonf commented 5 years ago

Hi, I encountered the same error as in this post. I checked and found that ref-cover.bed and other files in my reference directory were empty. However, my IRFinder is the latest version, 1.2.5. Please advise how to solve this problem. Thanks!