fulcrumgenomics / fgbio

Tools for working with genomic and high throughput sequencing data.
http://fulcrumgenomics.github.io/fgbio/
MIT License

PileupBuilder should not report insertions when checking the final mapped base before soft-clipping #956

Closed jrm5100 closed 9 months ago

jrm5100 commented 10 months ago

I was using the PileupBuilder and ran into an error when attempting to collect an insertion sequence:

case entry: InsertionEntry if entry.rec.start < pos  =>
  // Get insertion bases
  val insStart     = entry.rec.readPosAtRefPos(pos, returnLastBaseIfDeleted = true)
  val insEnd       = entry.rec.readPosAtRefPos(pos + 1, returnLastBaseIfDeleted = true)
  val alleleString = entry.rec.basesString.substring(insStart - 1, insEnd - 1)

The call to substring failed because insEnd was 0.

I tracked this down to an incorrect check in PileupBuilder. rec.length includes soft-clipped bases, so the check offset < rec.length - 1 is always true for a pileup at the last mapped base as long as at least one soft-clipped base follows it. The second half of the check (rec.refPosAtReadPos(offset + 1)) will also be true, but for the wrong reason: the next base is soft-clipped, not inserted.
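To make the failure mode concrete, here is a minimal, self-contained sketch that does not depend on fgbio. Everything in it is hypothetical and for illustration only: `Elem`, `Read`, `naiveIsInsertion`, and `clipAwareIsInsertion` are not fgbio names. It uses 1-based read positions rather than the `offset` from the check above, and it assumes the read has no hard clips. The clip-aware variant is one possible way to bound the check, not necessarily the change made in this PR.

```scala
// Illustrative model only; none of these names are fgbio APIs.
object SoftClipInsertionSketch {
  private val ConsumesRead = Set('M', 'I', 'S', '=', 'X')
  private val ConsumesRef  = Set('M', 'D', 'N', '=', 'X')

  /** One CIGAR element: operator character and length. */
  final case class Elem(op: Char, len: Int)

  /** Simplified read: 1-based reference start plus CIGAR (no hard clips assumed). */
  final case class Read(start: Int, cigar: List[Elem]) {
    /** Read length, counting soft-clipped and inserted bases. */
    def length: Int = cigar.filter(e => ConsumesRead(e.op)).map(_.len).sum

    /** 1-based read position of the last base before any trailing soft clip. */
    def lastMappedReadPos: Int =
      length - cigar.lastOption.filter(_.op == 'S').map(_.len).getOrElse(0)

    /** Reference position for a 1-based read position, or 0 when that
      * base is not aligned (inserted, soft-clipped, or out of range). */
    def refPosAtReadPos(readPos: Int): Int = {
      @annotation.tailrec
      def loop(rest: List[Elem], readAt: Int, refAt: Int): Int = rest match {
        case Nil => 0
        case Elem(op, len) :: tail =>
          val read = ConsumesRead(op)
          val ref  = ConsumesRef(op)
          if (read && readPos < readAt + len) {
            if (ref) refAt + (readPos - readAt) else 0
          } else {
            loop(tail, if (read) readAt + len else readAt, if (ref) refAt + len else refAt)
          }
      }
      if (readPos < 1) 0 else loop(cigar, 1, start)
    }
  }

  /** Bounds the check by the full read length, so a trailing soft clip
    * looks like an insertion after the last mapped base. */
  def naiveIsInsertion(r: Read, readPos: Int): Boolean =
    readPos < r.length && r.refPosAtReadPos(readPos + 1) == 0

  /** Bounds the check by the last mapped base instead, so only bases
    * followed by further aligned sequence can report an insertion. */
  def clipAwareIsInsertion(r: Read, readPos: Int): Boolean =
    readPos < r.lastMappedReadPos && r.refPosAtReadPos(readPos + 1) == 0
}
```

For a read with CIGAR 5M3S, the naive check reports an insertion at read position 5 (the last mapped base), while the clip-aware check does not. For 3M2I3M2S, both checks still report the genuine insertion after read position 3.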

I shoehorned in a test where it seems appropriate.

codecov[bot] commented 10 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (abb741b) 95.62% compared to head (bc462fb) 95.62%.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #956   +/-   ##
=======================================
  Coverage   95.62%   95.62%
=======================================
  Files         126      126
  Lines        7355     7355
  Branches      512      528   +16
=======================================
  Hits         7033     7033
  Misses        322      322
```

| [Flag](https://app.codecov.io/gh/fulcrumgenomics/fgbio/pull/956/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fulcrumgenomics) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/fulcrumgenomics/fgbio/pull/956/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fulcrumgenomics) | `95.62% <100.00%> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=fulcrumgenomics#carryforward-flags-in-the-pull-request-comment) to find out more.