sklarz-bgu closed this 5 years ago
Dear sklarz-bgu
How did you solve this problem? I'm struggling with the same problem now.
Dear sklarz-bgu and minnieanne
I'm having the same problem and I can't find any solution at all
Dear all,
I am facing the exact same problem and can't find any solution either :(
In my case, the solution was to use a different GTF file. Specifically, I had problems with the GENCODE primary assembly GTF file (as recommended in the STAR manual), but switching to the GTF file that covers only the reference chromosomes resolved the error.
The problem here seems to be in the GTF, specifically overlapping exons within the same transcript. This most likely happens if you're using a GTF for a non-model organism. These exons cause STAR to report incorrect transcript lengths, and although the alignment step finishes, RSEM can't handle the resulting BAM. More info here: https://github.com/alexdobin/STAR/issues/1128
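For anyone who wants to check their own annotation for this, here is a minimal Python sketch that flags transcripts whose exons overlap. It is not part of STAR or RSEM; the function names and the example records are made up for illustration, and it assumes a standard tab-separated GTF with a transcript_id attribute in column 9.

```python
import re
from collections import defaultdict


def parse_exons(gtf_lines):
    """Group exon intervals (1-based, inclusive) by transcript_id."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records matter here
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            exons[match.group(1)].append((int(fields[3]), int(fields[4])))
    return exons


def find_overlaps(exons):
    """Return {transcript_id: [(exon_a, exon_b), ...]} for overlapping exons.

    Only neighbouring exons (after sorting by start) are compared, which is
    enough to flag every transcript that contains at least one overlap.
    """
    overlaps = {}
    for tid, intervals in exons.items():
        ordered = sorted(intervals)
        hits = [(prev, cur) for prev, cur in zip(ordered, ordered[1:])
                if cur[0] <= prev[1]]
        if hits:
            overlaps[tid] = hits
    return overlaps


# Hypothetical records: t1 has overlapping exons, t2 does not.
example = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t150\t300\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g2"; transcript_id "t2";',
    'chr1\tsrc\texon\t250\t300\t.\t+\t.\tgene_id "g2"; transcript_id "t2";',
]
overlapping = find_overlaps(parse_exons(example))
print(overlapping)
```

To run it on a real annotation, pass an open file handle to parse_exons instead of the example list. Any transcript it reports is a candidate for the length mismatch described above.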
Dear RSEM developers
I've been using RSEM for a while but have never seen anything like this. I've got a non-model genome with a GTF annotation. After building a STAR reference for the genome with the GTF file, I map some reads to the genome, followed by quantification with RSEM. I'm following the regular pipeline, which has succeeded for me before, even with this genome.
However, I'm trying the same pipeline for a new set of reads, and I get several reads with the following comment:
Fragment HWI-ST132_0470:2:1101:1007:48766#GCGGGC is hung over the end of transcript 33563! It is possible that the aligner you use gave different read lengths for a same read in SAM file.
I've traced this down to line 115 in PairedEndQModel.h. These are the values calculated for the variables therein: fpos=408; insertLen=210; totLen=582. The read fails because fpos+insertLen>totLen. However, in the header of the 'toTranscriptome' BAM file, the length is given as LN:1054, according to which the read does not fail the condition.
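To make the arithmetic explicit, here is a small Python sketch of the comparison as described above. It reproduces only the condition fpos+insertLen>totLen with the reported numbers; the function name is hypothetical and this is not RSEM's actual code.

```python
def hangs_over(fpos, insert_len, tot_len):
    """True when the fragment would extend past the end of the transcript."""
    return fpos + insert_len > tot_len


fpos, insert_len = 408, 210  # values reported above

# With the length RSEM appears to use (582), the fragment end 618 is past the end.
fails_with_rsem_length = hangs_over(fpos, insert_len, 582)

# With the length from the BAM header (LN:1054), the same fragment fits.
fails_with_header_length = hangs_over(fpos, insert_len, 1054)

print(fails_with_rsem_length, fails_with_header_length)
```

The two lengths give opposite answers for the same fragment, so the disagreement is in the transcript length, not in the read itself.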
How does RSEM calculate the transcript length? What could be the reason for this failure? What am I doing wrong?
Thank you very much! Menachem
Attached are the offending sections of the BAM files and GTF file.
offend.genome.sam.txt offend.transcriptome.sam.txt offending.gtf.txt