broadinstitute / picard

A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
https://broadinstitute.github.io/picard/
MIT License
984 stars 369 forks source link

CollectRnaSeqMetrics failed to work for the RIBOSOMAL INTERVALS on the 639th sequence of reference genome #1363

Open wangyugui opened 5 years ago

wangyugui commented 5 years ago

Bug Report

CollectRnaSeqMetrics failed to work for the RIBOSOMAL INTERVALS on the 639th sequence of reference genome, but is OK for the INTERVALS on the 25th/1st sequence.

Affected tool(s)

CollectRnaSeqMetrics RIBOSOMAL_INTERVALS=File

Affected version(s)

Description

CollectRnaSeqMetrics failed to count any reads mapped to the RIBOSOMAL INTERVALS region on 639th seq.

Steps to reproduce

test case 1: refernce genome with 648 sequence, MT is the 639th sequence. MT:RNR1 as the rRNA interval. CollectRnaSeqMetrics failed to count any reads mapped to this region.

test case 2: refernce genome with 195 sequence, MT is the 25th sequence. MT:RNR1 as the rRNA interval. CollectRnaSeqMetrics is OK to count the reads mapped to this region.

test case 3: refernce genome with 648 sequence, chr1 is the 1st sequence. chr1:WASH7Pas the rRNA interval(just for test). CollectRnaSeqMetrics is OK count the reads mapped to this region.

wangyugui commented 5 years ago

Hi,

This dirty patch works for some test cases.

diff --git a/src/main/java/picard/analysis/directed/RnaSeqMetricsCollector.java b/src/main/java/picard/analysis/directed/RnaSeqMetricsCollector.java
index b03904212..64fcb17c8 100644
--- a/src/main/java/picard/analysis/directed/RnaSeqMetricsCollector.java
+++ b/src/main/java/picard/analysis/directed/RnaSeqMetricsCollector.java
@@ -149,9 +149,13 @@ public class RnaSeqMetricsCollector extends SAMRecordMultiLevelCollector<RnaSeqM
             // Attempt to get an interval for the entire fragment (if paired read) else just use the read itself.
             // If paired read is chimeric or has one end unmapped, don't create an interval.
             final Interval fragmentInterval;
+           int i1=rec.getReferenceIndex();
+           int i2=rec.getMateReferenceIndex();
+           boolean flag=rec.getMateUnmappedFlag();
+
             if (!rec.getReadPairedFlag()) {
                 fragmentInterval = readInterval;
-            } else if (rec.getMateUnmappedFlag() || rec.getReferenceIndex() != rec.getMateReferenceIndex()) {
+            } else if ( flag || i1!=i2 ) {
                 fragmentInterval = null;
             } else {
                 final int fragmentStart = Math.min(rec.getAlignmentStart(), rec.getMateAlignmentStart());

This should be a bug of htsjdk or picard.

wangyugui commented 5 years ago

And the test result show that picard CollectorRnaSeqMetrics does NOT check the strand of rRNA interval.