tba123 / rna-star

Automatically exported from code.google.com/p/rna-star

Number of mismatches reported in the SAM file is not correct #24

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. STAR runs successfully; the problem is an inconsistency in the reported 
alignments.

What is the expected output? What do you see instead?

Below is a read pair aligned to three different places: one mate aligns to 
chr1, and the other aligns to chr1 and chr3 as a fusion read. I compared the 
reads against the reference sequences and found no mismatches, but the nM field 
reports two mismatches.

C185NACXX121031:7:1214:18090:96585  337 chr3    33430704    3   68S33M  chr1    569497  0   TTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAGCCCCTAAATCATCACCAGAATGTCTATCCATG   >CADC>:DBBB@CDDBEEBDDDBDHEHECCGGFIGHF@IGGGJJJIIJIJIIHIIJJIIIBIIJJHJIJJJIIHHIGEIHGHFJJJIGHGHHHFFFFFCCC   NH:i:2  HI:i:1  AS:i:36 nM:i:2
C185NACXX121031:7:1214:18090:96585  163 chr1    569497  3   101M    =   569722  293 TAGTTATTATCGAAACCATCAGCCTACTCATTCAACCAATAGCCCTGGCCGTACGCCTAACCGCTAACATTACTGCAGGCCACCTACTCATGCACCTAATT    BCCDFFFFGHHHHJJJJJJJIIJJIJJJJIJJJJIJJJHJFIJJGIJJJIBFHIJJJJGIHHHFFDDEDCDDDDDDDCDDDDDBDDDDDDCDDDDDDDCD:   NH:i:2  HI:i:2  AS:i:168    nM:i:2
C185NACXX121031:7:1214:18090:96585  83  chr1    569722  3   68M33S  =   569497  -293    TTCTAGTAAGCCTCTACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAGCCCCTAAATCATCACCAGAATGTCTATCCATG  >CADC>:DBBB@CDDBEEBDDDBDHEHECCGGFIGHF@IGGGJJJIIJIJIIHIIJJIIIBIIJJHJIJJJIIHHIGEIHGHFJJJIGHGHHHFFFFFCCC   NH:i:2  HI:i:2  AS:i:168    nM:i:2

What version of the product are you using? On what operating system?
STAR_2.3.0e_r291

Please provide any additional information below.

Original issue reported on code.google.com by zhenhuaw...@gmail.com on 6 May 2014 at 4:54

GoogleCodeExporter commented 9 years ago
The reads were aligned to hg19.

Original comment by zhenhuaw...@gmail.com on 6 May 2014 at 4:55

GoogleCodeExporter commented 9 years ago
For a faster response, please post your questions in the STAR forum 
https://groups.google.com/d/forum/rna-star, or e-mail me directly at 
dobin@cshl.edu 

Please switch to the latest patch from 
http://it-collab01.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARreleases/Patches/

nM is the total number of mismatches summed over both mates, so it is 
reported per pair rather than per read.
If you need the standard NM attribute (edit distance for each mate), please use 
(with the latest patch) --outSAMattributes NH HI NM MD

Original comment by adobin@gmail.com on 8 May 2014 at 4:49
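
A minimal Python sketch of how the three records from the report can be inspected. It only parses the fields already shown above: it decodes the mate number and the secondary-alignment bit from the SAM FLAG and reads the optional tags, so the shared nM value can be seen next to each mate. SEQ and QUAL are replaced by `*` here for brevity, and the helper name `parse_sam_line` is hypothetical, not part of STAR or any library.

```python
# Records from the report, tab-separated, with SEQ and QUAL abbreviated to "*".
RECORDS = [
    "C185NACXX121031:7:1214:18090:96585\t337\tchr3\t33430704\t3\t68S33M"
    "\tchr1\t569497\t0\t*\t*\tNH:i:2\tHI:i:1\tAS:i:36\tnM:i:2",
    "C185NACXX121031:7:1214:18090:96585\t163\tchr1\t569497\t3\t101M"
    "\t=\t569722\t293\t*\t*\tNH:i:2\tHI:i:2\tAS:i:168\tnM:i:2",
    "C185NACXX121031:7:1214:18090:96585\t83\tchr1\t569722\t3\t68M33S"
    "\t=\t569497\t-293\t*\t*\tNH:i:2\tHI:i:2\tAS:i:168\tnM:i:2",
]


def parse_sam_line(line):
    """Parse one SAM alignment line into a small dict (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])

    # Optional fields start at column 12 and have the form TAG:TYPE:VALUE.
    tags = {}
    for item in fields[11:]:
        name, value_type, value = item.split(":", 2)
        tags[name] = int(value) if value_type == "i" else value

    # FLAG bit 0x40 marks the first mate, 0x80 the second mate.
    if flag & 0x40:
        mate = 1
    elif flag & 0x80:
        mate = 2
    else:
        mate = 0

    return {
        "rname": fields[2],
        "pos": int(fields[3]),
        "cigar": fields[5],
        "mate": mate,
        "secondary": bool(flag & 0x100),  # 0x100 marks a secondary alignment
        "tags": tags,
    }


parsed = [parse_sam_line(record) for record in RECORDS]

for rec in parsed:
    print(
        f"mate {rec['mate']} {rec['rname']}:{rec['pos']} {rec['cigar']} "
        f"secondary={rec['secondary']} HI={rec['tags']['HI']} nM={rec['tags']['nM']}"
    )
```

Every record carries the same nM value regardless of which mate it describes, which is consistent with nM being a pair-level count. Locating the mismatches within each mate would need the per-mate NM and MD attributes requested with --outSAMattributes NH HI NM MD.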