itmat / rum

RNA-Seq Unified Mapper
http://cbil.upenn.edu/RUM
MIT License
26 stars 4 forks source link

MAPQ should be set to "0" for unmapped reads in SAM file #171

Closed delagoya closed 11 years ago

delagoya commented 11 years ago

Picard tools expect that unmapped reads in the SAM file have the MAPQ field be set to zero .

mdelaurentis commented 11 years ago

Do you recall what version you saw this in? What was it set to for unmapped reads? In the latest version (on the develop branch) it gives the following values:

Unique mappers: 25 Non-unique mappers: 0 No mapping: 0

I believe I set it up that way back in October based on some emails from Greg indicating that that's how he wanted it to work. Is this not the behavior that you saw? If so then there's a bug...

delagoya commented 11 years ago

I am not sure what is complaining about actually. Here is an error (RUM.sam from RUM version 2.0.5_02):

Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. MAPQ should be 0 for unmapped read.; File RUM.sam; Line 71 Line: seq.4 165 chr5 90464131 25 * = 9046413 0 GCAAACCTTCGTGGAAACTATGGTGAACTGGGTGACTGCTGGACAATACAAGATCCCGATAGAAACGAATGTTTCCTGCAACACAAAGATGAAAAACCCAG +:=D++2++22++)<EAC:+<<+++22A13)1)11:?4?*?8??####################################################### XO:A:F IH:i:1 HI:i:1 at net.sf.samtools.SAMTextReader.reportErrorParsingLine(SAMTextReader.java:230) at net.sf.samtools.SAMTextReader.access$500(SAMTextReader.java:40) at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:434) at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:278) at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:250) at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:664) at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:642) at net.sf.picard.sam.FixMateInformation.doWork(FixMateInformation.java:147) at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177) at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119) at net.sf.picard.sam.FixMateInformation.main(FixMateInformation.java:75

delagoya commented 11 years ago

I think this is because the mate pair is mapped and the offending one is not? Here are the two mate pairs:

seq.4 89 chr5 90464131 25 65M657N36M = 90464131 0 GAAAGGCCAGAGGCTGATGCCATGTGCCCCTCCCATAAGGAAAACCCAACCACCTTTATGGGAC ACTATTTGCATGAAGTTGCCAGAAGACATCCTTATTT +:=B1+22:3<A,AC4<<C+<CC;F,3A9ACDF??E4?EF IC3?9?9B3D9?8)(77BDCCC)=A#################################### XO:A:F MD:Z:17G 9A5TT66 NM:i:4 IH:i:1 HI:i:1 XS:A:+ seq.4 165 chr5 90464131 25 * = 90464131 0 GCAAACCTTCGTGGAAACTATGGTGAACTGGGTGACTGCTGGACAATACAAGATCCCGATAGAAACGAATGT TTCCTGCAACACAAAGATGAAAAACCCAG +:=D++2++22++)<EAC:+<<+++22A13)1)11:?4?*?8??## ##################################################### XO:A:F IH:i:1 HI:i:1

mdelaurentis commented 11 years ago

Yes, that's probably it. The 0x4 (unmapped) bit is set on the offending one and not on the other one. I'll fix it, I can't imagine it will take long. Thanks.

On Thu, May 23, 2013 at 11:31 AM, Angel Pizarro notifications@github.comwrote:

I think this is because the mate pair is mapped and the offending one is not? Here are the two mate pairs:

seq.4 89 chr5 90464131 25 65M657N36M = 90464131 0 GAAAGGCCAGAGGCTGATGCCATGTGCCCCTCCCATAAGGAAAACCCAACCACCTTTATGGGAC ACTATTTGCATGAAGTTGCCAGAAGACATCCTTATTT +:=B1+22:3<A,AC4<<C+<CC;F,3A9ACDF??E4?EF IC3?9?9B3D9?8)(77BDCCC)=A#################################### XO:A:F MD:Z:17G 9A5TT66 NM:i:4 IH:i:1 HI:i:1 XS:A:+ seq.4 165 chr5 90464131 25 * = 90464131 0 GCAAACCTTCGTGGAAACTATGGTGAACTGGGTGACTGCTGGACAATACAAGATCCCGATAGAAACGAATGT

TTCCTGCAACACAAAGATGAAAAACCCAG +:=D++2++22++)<EAC:+<<+++22A13)1)11:?4?*?8??## ##################################################### XO:A:F IH:i:1 HI:i:1

Reply to this email directly or view it on GitHubhttps://github.com/PGFI/rum/issues/171#issuecomment-18351152 .