Closed maplesond closed 9 years ago
Just checked tophat, and it seems that it doesn't allow any mismatches in spliced reads. This probably explains why we were getting 0 mismatches for all junctions. To be sure, I'll need to raise this limit in tophat and rerun. If so, the MaxMMES metric isn't very useful on default tophat output.
I now work out the number of mismatches explicitly by comparing the query sequence with the genomic sequence at the same positions, compensating for any insertions and deletions. The number of mismatches is then the Hamming distance between the two sequences.
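
A rough sketch of that calculation in Python, for illustration only. It is not the code used in the project. It assumes the alignment is described by a SAM-style CIGAR string and that `ref` is the genomic sequence starting at the alignment's first reference position; the function name `count_mismatches` and its parameters are made up for this example.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_mismatches(query, ref, cigar):
    """Count mismatching bases between a read and the genome.

    query: read sequence as stored in the alignment record
    ref:   genomic sequence starting at the alignment start (assumption)
    cigar: SAM-style CIGAR string, e.g. "3M100N4M"
    """
    q_pos = 0  # current offset into the read
    r_pos = 0  # current offset into the genomic sequence
    q_aligned = []
    r_aligned = []
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":
            # Aligned block: keep both sides for comparison.
            q_aligned.append(query[q_pos:q_pos + n])
            r_aligned.append(ref[r_pos:r_pos + n])
            q_pos += n
            r_pos += n
        elif op in "IS":
            # Insertion or soft clip: present in the read only, so skip it.
            q_pos += n
        elif op in "DN":
            # Deletion or intron: present in the genome only, so skip it.
            r_pos += n
        # H and P consume neither sequence.
    q = "".join(q_aligned).upper()
    r = "".join(r_aligned).upper()
    # Both strings now have equal length, so this is the Hamming distance.
    return sum(a != b for a, b in zip(q, r))
```

For a spliced read with CIGAR `3M100N4M`, the 100 intronic bases are skipped on the genome side and only the 7 aligned bases are compared.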