Open junjiemama opened 3 weeks ago
The sequence of splice junctions in each aligned read is checked against the splice junctions of annotated isoforms. If the junctions for the read are the same as the junctions in an annotated isoform, then the read is a full-length transcript for that annotated isoform. If an alignment has a sequence of junctions that would match an annotated isoform, but the alignment is missing one or more junctions at the beginning or end, then that alignment could be a non full-length transcript for that annotated isoform.
For alignments with novel junctions, there is a similar check to see if the sequence of junctions would match another novel alignment but with one or more junctions missing at the beginning or end. The novel alignments which do not match any longer splice junction sequence are considered full-length transcripts. The alignments which are missing some junctions are considered non full-length.
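To make the rules above concrete, here is a minimal sketch of the junction-chain comparison. It is not the pipeline's actual code: the names `is_truncation_of` and `classify`, the `(start, end)` junction representation, and the label strings are all hypothetical. It also assumes that "missing junctions at the beginning or end" means the read's chain is a contiguous run inside the longer chain, and it does not handle reads with no junctions.

```python
from typing import Sequence, Tuple

Junction = Tuple[int, int]      # hypothetical: (intron start, intron end)
Chain = Tuple[Junction, ...]    # ordered junctions of one alignment or isoform


def is_truncation_of(short: Chain, long: Chain) -> bool:
    """True if `short` is a contiguous run of `long` that lacks one or more
    junctions at the start and/or end (assumed reading of the description)."""
    n, m = len(short), len(long)
    if n == 0 or n >= m:
        return False
    return any(long[i:i + n] == short for i in range(m - n + 1))


def classify(read_chain: Chain,
             annotated: Sequence[Chain],
             novel_chains: Sequence[Chain]) -> str:
    """Label one alignment's junction chain (labels are illustrative)."""
    # Exact match to an annotated isoform's junctions.
    if read_chain in annotated:
        return "annotated full-length"
    # Would match an annotated isoform if end junctions were added back.
    if any(is_truncation_of(read_chain, iso) for iso in annotated):
        return "annotated non full-length"
    # Novel chain that is a shortened version of another novel chain.
    if any(is_truncation_of(read_chain, other) for other in novel_chains):
        return "novel non full-length"
    # Novel chain not contained in any longer chain.
    return "novel full-length"
```

For example, with one annotated isoform whose junctions are `((100, 200), (300, 400), (500, 600))`, a read with only `((300, 400), (500, 600))` would be labeled as non full-length for that isoform under these assumptions.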
Thank you for your reply!
In the algorithm, there is a step that distinguishes full-length transcripts from non full-length transcripts. May I ask what criteria the pipeline uses for this step? Thank you!