Open rcyyjiang opened 5 years ago
Hi Mei,
You shouldn't kill generateReadsError.pl
which is a core step during reference building (the mapability estimation), although I'm not sure how you achieve that without killing IRFinder itself. And I don't know why you expect downstream things still work without completely running the beginning part.
What system setup you're working with? How many memory you allocated for IRFinder and how many cores were you using? Human transcriptome GRCh37.75 requires at least 48G memory and 4 cpu threads for both STAR and IRFinder. In terms of that, a typical reference building time is around two hours. If you're working on your own PC, you must know this information. If you're working on a cluster, make sure you allocate sufficient computational resources before running IRFinder.
If you think you have enough cores and memory and if you really think it's generateReadsError.pl
that fails your building, you can look into that as well. You might want to independently run that particular part that calls generateReadsError.pl
, to investigate why it takes so long or if there is any internal error message from Perl. Here is a shortcut for you (copied from bin/util/Mapability
script):
time "$STAREXEC" \
--genomeDir "$STARGENOME" \
--genomeLoad NoSharedMemory \
--runThreadN $THREADS --outStd SAM --outSAMmode NoQS \
--outSAMattributes None \
--outFilterMultimapNmax 1 \
--readFilesIn <("$LIBEXEC/generateReadsError.pl" 70 10 < "$FA") \
| \
awk -v tmpdir="$TMPCHR" -v tmpcmp="$TMPCMP" -v tmpext="$TMPEXT" 'BEGIN {FS="[\t!]"; OFS="\t"} (($8 == "70M") && ($3 == $6) && ($2 == $5)) {print $5, $6-1, $6+69 | ( tmpcmp " -c1 > " tmpdir "/" $5 ".bed." tmpext ) }'
In the above line:
$FA
with the path to your genome.fa
file $STAREXEC
with your path of STAR execution $STARGENOME
with your path of STAR genome, which IRFinder just builtbin/util/Mapability
. Just read.), Call it from bash directly. Does it give you any error? If it just needs time to complete, the problem might be due to your system instead of being due to IRFinder.
Nevertheless. let's work around your generateReadsError.pl
issue first so that we can make sure the reference can be successfully built in the first place. That might directly solve your problem.
Best, Dadi
Hi. I got the following errors in the quantification step.
Actually, I also had a problem in running generateReadsError.pl. It took a lot of time, more than one day. So I killed the generateReadsError.pl. I don't know whether this caused the quantification errors.
Below are more details which might facilitate troubleshooting.
By the way, the GTF file and the FA file were named as the manual requested.
Dependencies gcc --version: 5.4.0 STAR --version: 2.5.2b bedtools --version: v2.20.1
Reference building command line
IRFinder -m BuildRefProcess -r irfinderdb -b irfinderdb/exclude.bed.gz
No error messages came out in this step except that I killed running generateReadsError.pl.Quantification command line
IRFinder -m BAM -r irfinderdb Unsorted.bam
Thanks very much! Any help will be greatly appreciated.
Best, Mei