alexdobin / STAR

RNA-seq aligner
MIT License

2.5.2a writes incorrect mate position (outside of the chromosome) #153

Closed by chihlee 8 years ago

chihlee commented 8 years ago

Hello Alex,

Please see below and let me know if you need anything else to troubleshoot. Thanks.

Best, Steve

Alignments by 2.5.2a:

NS500507:219:HVCK5BGXX:2:11103:21605:11694      2209    chr2L   53762   50      5S15M131H       chr2R   38168885        0       CGCGATTTCCAATCCGCATC    AAAAAEEEAEEEEEEEAEEE    NH:i:1   NM:i:0  MD:Z:15 RG:Z:1  SA:Z:chr2R,14575845,+,20S130M1S,50,0;
NS500507:219:HVCK5BGXX:2:11103:21605:11694      163     chr2R   14575845        50      20S130M1S       =       14575925        229     CGCGATTTCCAATCCGCATCCACACACACCAACCAAGTGAATATAATATGTAAGCGAAAGGCAAGAAGCCAGCAACAAAGCGCCAGTGATAAGCGCAATAATATAAATGCAATATAGTATAAATTCAACGTGAAAAGAAATCTCGCCAGCC  AAAAAEEEAEEEEEEEAEEEEEE/AEAEEEAEEEEEEEEEAEAEEEEEA/EAE/EE/E/EEEEAA6AAAA/EAAEEAEAE/EEEEEE/EE//EA</AAE/EE/AA6<<EE</EA//EA/A/A<A/<<A</A<6AA<E/E/<</6<<</A</  NH:i:1  NM:i:0  MD:Z:130        RG:Z:1  SA:Z:chr2L,53762,+,5S15M131H,50,0;
NS500507:219:HVCK5BGXX:2:11103:21605:11694      83      chr2R   14575925        50      57M2I92M        =       14575845        -229    ATATAAATGCAATATAGTATAAATTCAACGTGAAAAGAAATCTCGCCACCGAAATACAAAAAAAAAACCGGAGGCAGCGACAAATGGAAATCAAGTAAAATGAAGAAGAGCGGAGAGATAGAAATCGTAATGTGTAAGCGCGACCAGAGCG  <A6AAEEAAAEE/A/EA/EEEAE<</<EEA//<EEE/EEEAAEE/EEA/<AEEE/EEEEEEEEEEEE/EAEEA<<<EEAEEEEEEA/EEEEAEAEEEEEEEEEEEEAEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEE6EEAAAAA  NH:i:1  NM:i:5  MD:Z:48G16G6A76 RG:Z:1

The length of chr2R is 21146708, but the mate position written in the first record (38168885) is greater than that length.
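For anyone who wants to scan a whole file for the same symptom, here is a minimal sketch in plain Python that flags records whose mate position (PNEXT, column 8) lies beyond the end of the mate reference (RNEXT, column 7). The function name `mate_positions_out_of_range` and the `chrom_lengths` dictionary are illustrative assumptions, not part of STAR or samtools; in practice the lengths would come from the `@SQ` header lines, and BAM output would first be converted to SAM text.

```python
def mate_positions_out_of_range(sam_lines, chrom_lengths):
    """Yield (QNAME, mate reference, PNEXT) for records whose mate position
    is larger than the length of the mate reference sequence."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        qname, rname, rnext, pnext = fields[0], fields[2], fields[6], int(fields[7])
        # "=" in RNEXT means the mate is on the same reference as the read.
        mate_ref = rname if rnext == "=" else rnext
        length = chrom_lengths.get(mate_ref)
        if length is not None and pnext > length:
            yield qname, mate_ref, pnext


# The first 11 columns of the chimeric record reported above.
record = "\t".join([
    "NS500507:219:HVCK5BGXX:2:11103:21605:11694", "2209", "chr2L", "53762",
    "50", "5S15M131H", "chr2R", "38168885", "0",
    "CGCGATTTCCAATCCGCATC", "AAAAAEEEAEEEEEEEAEEE",
])

# Only the chr2R length is stated in this report, so only that one is listed.
for hit in mate_positions_out_of_range([record], {"chr2R": 21146708}):
    print(hit)
```

References missing from `chrom_lengths` are skipped rather than reported, so an incomplete length table does not produce false alarms.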

2.5.1b doesn't seem to have this issue.

NS500507:219:HVCK5BGXX:2:11103:21605:11694      2209    chr2L   53762   50      5S15M131H       chr2R   14575925        0       CGCGATTTCCAATCCGCATC    AAAAAEEEAEEEEEEEAEEE    NH:i:1   NM:i:0  MD:Z:15 RG:Z:1  SA:Z:chr2R,14575845,+,20S130M1S,50,0;
NS500507:219:HVCK5BGXX:2:11103:21605:11694      163     chr2R   14575845        50      20S130M1S       =       14575925        229     CGCGATTTCCAATCCGCATCCACACACACCAACCAAGTGAATATAATATGTAAGCGAAAGGCAAGAAGCCAGCAACAAAGCGCCAGTGATAAGCGCAATAATATAAATGCAATATAGTATAAATTCAACGTGAAAAGAAATCTCGCCAGCC  AAAAAEEEAEEEEEEEAEEEEEE/AEAEEEAEEEEEEEEEAEAEEEEEA/EAE/EE/E/EEEEAA6AAAA/EAAEEAEAE/EEEEEE/EE//EA</AAE/EE/AA6<<EE</EA//EA/A/A<A/<<A</A<6AA<E/E/<</6<<</A</  NH:i:1  NM:i:0  MD:Z:130        RG:Z:1  SA:Z:chr2L,53762,+,5S15M131H,50,0;
NS500507:219:HVCK5BGXX:2:11103:21605:11694      83      chr2R   14575925        50      57M2I92M        =       14575845        -229    ATATAAATGCAATATAGTATAAATTCAACGTGAAAAGAAATCTCGCCACCGAAATACAAAAAAAAAACCGGAGGCAGCGACAAATGGAAATCAAGTAAAATGAAGAAGAGCGGAGAGATAGAAATCGTAATGTGTAAGCGCGACCAGAGCG  <A6AAEEAAAEE/A/EA/EEEAE<</<EEA//<EEE/EEEAAEE/EEA/<AEEE/EEEEEEEEEEEE/EAEEA<<<EEAEEEEEEA/EEEEAEAEEEEEEEEEEEEAEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEE6EEAAAAA  NH:i:1  NM:i:5  MD:Z:48G16G6A76 RG:Z:1
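A column-by-column comparison of the two chimeric (flag 2209) records makes the difference explicit. The sketch below is only an illustration: `differing_fields` is a hypothetical helper, and it splits on any whitespace because the records pasted here are space-separated, whereas a real SAM file is tab-delimited.

```python
SAM_COLUMNS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
               "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def differing_fields(record_a, record_b):
    """Return (column name, value in a, value in b) for each of the 11
    mandatory SAM columns whose values differ between the two records."""
    a = record_a.split()
    b = record_b.split()
    return [(name, x, y) for name, x, y in zip(SAM_COLUMNS, a, b) if x != y]


# Mandatory columns of the flag-2209 record from each of the two outputs above.
rec_252a = ("NS500507:219:HVCK5BGXX:2:11103:21605:11694 2209 chr2L 53762 50 "
            "5S15M131H chr2R 38168885 0 CGCGATTTCCAATCCGCATC AAAAAEEEAEEEEEEEAEEE")
rec_251b = ("NS500507:219:HVCK5BGXX:2:11103:21605:11694 2209 chr2L 53762 50 "
            "5S15M131H chr2R 14575925 0 CGCGATTTCCAATCCGCATC AAAAAEEEAEEEEEEEAEEE")

print(differing_fields(rec_252a, rec_251b))
# prints [('PNEXT', '38168885', '14575925')]
```

Only PNEXT differs, and the second value, 14575925, is the start of the mate's alignment on chr2R (the flag-83 record).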

Parameters file:

genomeDir  /genomes/Drosophila_melanogaster/UCSC/dm3/Sequence/STARindex2
outSAMattrRGline ID:1 "SM:2wk_Con_Midnight_1"
outSAMattributes NH NM MD
outSAMtype BAM SortedByCoordinate
limitBAMsortRAM 12000000000
outSAMunmapped Within
outSAMmapqUnique 50
readFilesIn "R1.fastq.gz" "R2.fastq.gz"
readFilesCommand zcat
sjdbGTFfile "/genomes/Drosophila_melanogaster/UCSC/dm3/Annotation/Genes/genes.gtf"
outFilterType BySJout
outSJfilterCountUniqueMin -1 2 2 2
outSJfilterCountTotalMin -1 2 2 2
outFilterIntronMotifs RemoveNoncanonical
twopassMode Basic
chimSegmentMin 12
chimJunctionOverhangMin 12
chimScoreDropMax 30
chimSegmentReadGapMax 5
chimScoreSeparation 5
chimOutType WithinBAM

Log.out

chihlee commented 8 years ago

Update: The same issue was seen in 2.5.1b as well.

alexdobin commented 8 years ago

Hi Steve,

I hope I fixed the bug. Please try the latest master. If it solves the problem, I will make a tagged release.

Cheers Alex

chihlee commented 8 years ago

Thanks, Alex! I'll try it and get back to you later.

Best, Steve

chihlee commented 8 years ago

Hello Alex,

I tried the master on the same data. Looks like the issue has been fixed. Thanks a lot!

Best, Steve

alexdobin commented 8 years ago

Hi Steve,

Great. Please let me know if you see any other issues with this option.

Cheers Alex