DaehwanKimLab / tophat

Spliced read mapper for RNA-Seq
http://ccb.jhu.edu/software/tophat
Boost Software License 1.0

Interoperability problems regarding read fields RNAME, RNEXT, POS and PNEXT for unmapped reads with mapped mate and MAPQ for all unmapped reads #17

Open cbrueffer opened 8 years ago

cbrueffer commented 8 years ago

Downstream software such as the Picard suite and samtools has issues with the values TopHat writes to the RNAME, RNEXT, POS and PNEXT fields of unmapped reads that have a mapped mate. Some of the errors these tools produce can be seen in this SEQanswers thread: http://seqanswers.com/forums/showthread.php?t=28155

I've been able to overcome these issues by assigning new values to the fields of the unmapped reads in question:

- RNAME: RNAME of the paired read
- RNEXT: RNAME of the paired read
- POS: POS of the paired read
- PNEXT: 0

Additionally, I had to set MAPQ to 0 for all unmapped reads.
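For illustration, the field assignments described above could be sketched roughly as follows. This is a minimal sketch, not the code from tophat-recondition: it assumes each record has already been split into its tab-separated SAM columns, and the function name `recondition`, the constants, and the example records are hypothetical.

```python
# Hypothetical helper; names are illustrative and not taken from tophat-recondition.
FLAG_UNMAPPED = 0x4  # SAM flag bit: this read is unmapped

# 0-based column indices of the mandatory SAM fields touched here
FLAG, RNAME, POS, MAPQ, RNEXT, PNEXT = 1, 2, 3, 4, 6, 7


def recondition(read, mate=None):
    """Return a copy of `read` (list of SAM fields as strings) with the workaround values applied."""
    fixed = list(read)
    if not int(fixed[FLAG]) & FLAG_UNMAPPED:
        return fixed  # mapped reads are left untouched

    # MAPQ is set to 0 for every unmapped read
    fixed[MAPQ] = "0"

    # Unmapped read with a mapped mate: copy the mate's coordinates
    if mate is not None and not int(mate[FLAG]) & FLAG_UNMAPPED:
        fixed[RNAME] = mate[RNAME]
        fixed[RNEXT] = mate[RNAME]
        fixed[POS] = mate[POS]
        fixed[PNEXT] = "0"
    return fixed


# Made-up example pair: mapped first mate (flag 73), unmapped second mate (flag 133)
mate = "r1\t73\tchr1\t1000\t50\t4M\t*\t0\t0\tACGT\tIIII".split("\t")
read = "r1\t133\t*\t0\t255\t*\tchr1\t1000\t0\tACGT\tIIII".split("\t")
print("\t".join(recondition(read, mate)))
```

The function returns a new list instead of modifying its input, so the original record stays available for comparison when checking the result with Picard or samtools.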

These workarounds are implemented in https://github.com/cbrueffer/tophat-recondition, which may help speed up verification and testing.