Open slazicoicr opened 5 years ago
It looks like the reason for reversing @cigarOp
is to calculate various statistics by cycle. If done in a clear and well structured way, that might be OK, but this code falls a long way short. Something to refactor when time permits.
I think the reversing does introduce false data at
@Op
has the reversed cigar string, but the $start1
position and the incrementing $start1 + $posOffset
are not changed to reflect a reversed fragment. I think it is correct to use the non-reversed cigar string here.
I don't think this error is propagated, as sub runningBaseCoverage
does not use the incorrect positions for calculations and a quick search did not find the the values of {runningBaseCoverage}
to be used anywhere else.
The parsed CIGAR string is reversed if it is on the - strand: https://github.com/oicr-gsi/bamqc/blob/a3964603efc86334e3345d30c331c9aacab256b2/lib/GSI/bamqc.pm#L1900
I assume the
$strand
variable is derived from the read reverse strand (16) flag.I am not sure how this feeds into downstream calculations, but I think that reversing it is unnecessary and could cause issues:
This thread states that reads from the reverse strands are reverse-complemented by aligners to ensure their orientation matches the reference. This means that:
So by reversing the CIGAR string:
By never reversing the CIGAR string, the CIGAR string, read sequence, and reference sequence would always remain in sync.