samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/
283 stars 242 forks source link

Fix race condition in SamRecordDuplicateComparator #1517

Closed fleharty closed 3 years ago

fleharty commented 4 years ago

Description

The getLibraryId function of SAMRecordDuplicateComparator.java can create race conditions when run multithreaded.

Things to think about before submitting:

fleharty commented 4 years ago

@lbergelson This resolves the race condition thing we discussed the other day. Would you be able to review this?

codecov-io commented 4 years ago

Codecov Report

Merging #1517 into master will increase coverage by 0.038%. The diff coverage is 78.947%.

@@               Coverage Diff               @@
##              master     #1517       +/-   ##
===============================================
+ Coverage     69.343%   69.381%   +0.038%     
- Complexity      8900      8911       +11     
===============================================
  Files            601       601               
  Lines          35503     35511        +8     
  Branches        5901      5901               
===============================================
+ Hits           24619     24638       +19     
+ Misses          8548      8536       -12     
- Partials        2336      2337        +1     
Impacted Files Coverage Δ Complexity Δ
...ain/java/htsjdk/samtools/DuplicateSetIterator.java 69.492% <0.000%> (ø) 13.000 <0.000> (ø)
.../htsjdk/samtools/SAMRecordDuplicateComparator.java 68.939% <83.333%> (+10.068%) 54.000 <8.000> (+10.000)
...java/htsjdk/samtools/DuplicateScoringStrategy.java 57.576% <0.000%> (+3.030%) 9.000% <0.000%> (+1.000%)
fleharty commented 3 years ago

@lbergelson I'm unsure about populateTransientAttributes. I can't get it to fail, so as far as I know it is okay. Horror of horrors would be that we do have a sneaky race condition that occurs with very low probability. I'm more than willing to synchronize this method if you think it is necessary, but I have no way of testing that change helps anything.

We depricated SAMRecordDuplicateComparator(final List headers) because it isn't used anywhere internally to HTSJDK, and I'm unaware of any external tools using this constructor.

I don't have permissions to merge this PR. Could you do that for me?

fleharty commented 3 years ago

@lbergelson Back to you.