amplab / snap

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
https://www.microsoft.com/en-us/research/project/snap/
Apache License 2.0

SAM output of mate pairs when only one is mapped #2

Closed by rnpandya 12 years ago

rnpandya commented 12 years ago

From Ilya Chorny <ichorny@gmail.com>:

Ignoring SAM validation error: ERROR: Record 133093, Read name D0KU2ACXX_0:5:1103:3721232:0/1, Mapped mate should have mate reference name.

For some reason Picard wants the reference name populated even if the mate is unmapped.

D0KU2ACXX_0:5:1103:3721232:0/1 97 chr3 193863968 60 5=1X1=1X33=1X12=1X5=1X10=1X3=3X3=2X1=1X2=1X12= * 0 0 TGGGGGGTGGATGGGATAGAAAACTTGAATGAGGCTTCTGAAACCCCAGATTTGAAAAGATCAAACGCTTTAGAGTTCCAATATTTTATTTCCCATGTTC =><A7)/,)&(3::(8(88=::;:8:((+(:33(288<:(8(((((&20(+(33(+(++8(883;()&(((((+((+++4((++((+(+4(((+(+(588

[ichorny@ukch-tst-lnmo01 NA12878_fastq]$ samtools view test.bam | grep D0KU2ACXX_0:5:1103:3721232:0/2

D0KU2ACXX_0:5:1103:3721232:0/2 133 * 0 0 * chr3 193863968 0 AATATTCACGCTAATCTCCTGCCGCAGCCTCCCGATTAGCTGGTATTACAGGCATGCTCCACCCTTCCCGGCAAATTTTGTTTTTTTAGTAGAGATTGAG (+(4B<5)&((((((2(5&)&0&(2(2''3',((((@;6.(())?3C;;?==7==7(('(.'(0(())01*GC3:8:)C)EHFC<,+2,,22+++:1+

rnpandya commented 12 years ago

From Ilya:

...you should put the chromosome name (of the mapped mate) in the 3rd column (RNAME) for reads which are unmapped but have mates which are mapped. This is not a critical error and can be worked around, but it will cause SAM validators to complain.
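
For anyone who needs the workaround before a fixed build, here is a rough Python sketch of the convention described above. It is not SNAP's code. The helper name `place_unmapped_mate` and the shortened records are made up for illustration, and it assumes each record has already been split into its tab-separated SAM fields. The unmapped read takes the RNAME and POS of its mapped mate, and the mapped read gets the mate-unmapped flag bit (0x8) plus an RNEXT/PNEXT that point at the same position.

```python
MATE_UNMAPPED = 0x8  # SAM flag bit: the mate of this read is unmapped

def place_unmapped_mate(mapped, unmapped):
    """Hypothetical helper: return fixed copies of a (mapped, unmapped) pair.

    Each argument is a list of SAM fields as strings:
    index 1 = FLAG, 2 = RNAME, 3 = POS, 6 = RNEXT, 7 = PNEXT.
    """
    mapped = list(mapped)
    unmapped = list(unmapped)
    rname, pos = mapped[2], mapped[3]

    # The unmapped read is placed at its mapped mate's coordinates.
    unmapped[2] = rname
    unmapped[3] = pos
    unmapped[6] = "="   # mate is on the same reference
    unmapped[7] = pos

    # The mapped read records that its mate is unmapped and where it was placed.
    mapped[1] = str(int(mapped[1]) | MATE_UNMAPPED)
    mapped[6] = "="
    mapped[7] = pos
    return mapped, unmapped

# Shortened, made-up records modeled on the pair in this issue.
read1 = "r/1\t97\tchr3\t193863968\t60\t4=\t*\t0\t0\tTGGG\t=><A".split("\t")
read2 = "r/2\t133\t*\t0\t0\t*\tchr3\t193863968\t0\tAATA\t(+(4".split("\t")

fixed1, fixed2 = place_unmapped_mate(read1, read2)
print("\t".join(fixed1))
print("\t".join(fixed2))
```

This only touches the placement fields and the 0x8 bit. Other flag bits, MAPQ, and CIGAR are left as they were, so the output may still need checking against whichever validator you use.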

bolosky commented 12 years ago

This is fixed in 0.13.7.