What steps will reproduce the problem?
1. Align long (100+ bp) reads with STAR, either Paired End or Single Read
2. Run RSeQC Junction Annotation script on aligned file
What is the expected output? What do you see instead?
In the past, we have run this function on 50bp single read experiments and
observed relatively low percentages of novel splicing junctions and events.
This was true for both STAR and Tophat alignments. Now, I am seeing huge
percentages of novel splicing junctions in my 100bp PE experiments aligned with
STAR, ranging from 30% to 60% in several different samples. Aligning the same
long reads with Tophat does not produce this output in RSeQC. I also found that
read length had the biggest effect on the reported percentage of novel
junctions, regardless of pairedness. Note that only the junction percentages
are suspicious; novel splicing events remain below 10% in every sample.
What version of the product are you using? On what operating system?
RSeQC v2.3.7 on 2.6.32-358.18.1.el6.x86_64 GNU/Linux
Please provide any additional information below.
The most unusual part about this is that my "novel" junctions are really +/- 1
or 2 bases from the annotated junctions! Even better, this only occurs where
shifting the splice site leaves the transcript sequence unchanged. In other
words, if the first base of the spliced out intron is the same as the first
base of the next exon, the shifted position is typically reported as a novel
splicing event. The true, annotated location is also reported by the program.
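To put a number on how many of the "novel" junctions are just annotated junctions shifted by a base or two, something like the sketch below could be run on the two junction lists. This is not RSeQC's own logic; the function name `classify_junctions`, the `(chrom, start, end)` tuple layout, and the `tolerance` parameter are my assumptions for illustration. The first two coordinates are the ones from this report, and the third is a made-up junction added only to show the "novel" case.

```python
def classify_junctions(observed, annotated, tolerance=2):
    """Label each observed junction as 'annotated', 'shifted', or 'novel'.

    A junction is 'shifted' when both of its ends are offset from an
    annotated junction on the same chromosome by the same non-zero amount,
    and that amount is at most `tolerance` bases.
    """
    annotated_set = set(annotated)
    by_chrom = {}
    for chrom, start, end in annotated:
        by_chrom.setdefault(chrom, []).append((start, end))

    labels = {}
    for junction in observed:
        chrom, start, end = junction
        if junction in annotated_set:
            labels[junction] = "annotated"
            continue
        is_shifted = any(
            start - a_start == end - a_end
            and 0 < abs(start - a_start) <= tolerance
            for a_start, a_end in by_chrom.get(chrom, [])
        )
        labels[junction] = "shifted" if is_shifted else "novel"
    return labels


# Annotated intron from this report.
annotated = [("chr6", 74227988, 74228076)]
observed = [
    ("chr6", 74227988, 74228076),  # exact match to the annotation
    ("chr6", 74227989, 74228077),  # same intron shifted by one base
    ("chr6", 74230000, 74231000),  # hypothetical unrelated junction
]

labels = classify_junctions(observed, annotated)
for junction in observed:
    print(junction, labels[junction])
```

If most of the 30% to 60% ends up in the "shifted" bucket, that would point at the ambiguous splice positions rather than at genuinely new junctions.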
For an example, check the following intron in a genome browser (attached):
chr6:74,227,988-74,228,076. If you splice out chr6:74,227,989-74,228,077
instead, the transcript sequence is identical, but it is now called a novel
splice. This is exactly the behavior I am getting in RSeQC. Can anyone describe
how RSeQC determines a novel splice junction and why these novel junctions are
not consistent with the STAR splice junction output log?
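The ambiguity itself can be reproduced on a toy sequence. The sketch below is only an illustration of why two intron positions can give the same transcript; the sequence is invented (not the chr6 locus), coordinates are 0-based half-open, and the helper names `splice` and `equivalent_shifts` are hypothetical.

```python
def splice(seq, intron_start, intron_end):
    """Remove seq[intron_start:intron_end] and join the two flanks."""
    return seq[:intron_start] + seq[intron_end:]


def equivalent_shifts(seq, intron_start, intron_end, max_shift=5):
    """Return every offset d (|d| <= max_shift) for which shifting both
    intron boundaries by d yields the same spliced sequence."""
    reference = splice(seq, intron_start, intron_end)
    shifts = []
    for d in range(-max_shift, max_shift + 1):
        start, end = intron_start + d, intron_end + d
        if start < 0 or end > len(seq):
            continue  # shifted intron would run off the sequence
        if splice(seq, start, end) == reference:
            shifts.append(d)
    return shifts


# Toy locus: the intron and the downstream exon both start with 'G'.
exon1 = "ACGTAC"
intron = "GTAAGTTTCAG"
exon2 = "GCCTTA"
genome = exon1 + intron + exon2

intron_start = len(exon1)
intron_end = intron_start + len(intron)

shifts = equivalent_shifts(genome, intron_start, intron_end)
print(shifts)
```

Here offset 0 is the true intron and offset 1 is the one-base shift; both produce the same spliced sequence because the first base of the intron matches the first base of the next exon, which is the situation described above.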
Original issue reported on code.google.com by kroll...@osu.edu on 2 Dec 2013 at 9:15