samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/
283 stars 242 forks source link

Add an option to SAMFileWriter to disable checking of ordering or rec… #1599

Closed tfenne closed 2 years ago

tfenne commented 2 years ago

…ords when presorted=true.

Description

This change adds a method to SAMFileWriter (not the factory) to enable/disable the code that checks that records are added in order when specifying a sort order and that the data is pre-sorted. The reasons for doing this are two-fold:

  1. When processing a queryname sorted file produced by samtools sort -n the difference of opinion in what constitutes queryname sort order causes HTSJDK to throw an exception when trying to write the file out while retaining the sort order in the header. This is a non-trivial pain point.
  2. It will increase performance (perhaps only a little)
codecov-commenter commented 2 years ago

Codecov Report

Merging #1599 (691e3b7) into master (8f82871) will decrease coverage by 0.006%. The diff coverage is 72.222%.

@@               Coverage Diff               @@
##              master     #1599       +/-   ##
===============================================
- Coverage     69.839%   69.833%   -0.006%     
- Complexity      9647      9648        +1     
===============================================
  Files            702       703        +1     
  Lines          37638     37647        +9     
  Branches        6113      6115        +2     
===============================================
+ Hits           26286     26290        +4     
- Misses          8903      8905        +2     
- Partials        2449      2452        +3     
Impacted Files Coverage Δ
.../main/java/htsjdk/samtools/AsyncSAMFileWriter.java 33.333% <0.000%> (-5.128%) :arrow_down:
src/main/java/htsjdk/samtools/SAMFileWriter.java 0.000% <0.000%> (ø)
...c/main/java/htsjdk/samtools/SAMFileWriterImpl.java 79.747% <86.667%> (-1.075%) :arrow_down:
...htsjdk/samtools/util/nio/DeleteOnExitPathHook.java 80.952% <0.000%> (-9.524%) :arrow_down:
src/main/java/htsjdk/samtools/SAMTextWriter.java 67.416% <0.000%> (+1.124%) :arrow_up:
src/main/java/htsjdk/io/AsyncWriterPool.java 73.611% <0.000%> (+1.389%) :arrow_up:
tfenne commented 2 years ago

@nh13 Happy to add a test so long as @lbergelson and/or others are on board with this PR in general. I looked briefly at the tests and adding one didn't seem trivially easy, so I figured I'd wait and make sure there aren't conceptual objections before spending that time.

tfenne commented 2 years ago

Added a test. Intent is to merge this once tests pass on CI.

tfenne commented 2 years ago

@lbergelson I don't object to making this option available through the factory, I just found it to be rather cumbersome. There are so many methods in there and I didn't want to take the time to figure out which of those need amending vs. not. Is this something you feel strongly about?

lbergelson commented 2 years ago

No, I don't feel strongly. It would be nice but it is definitely a huge hassle.

tfenne commented 2 years ago

Going to merge this as is then, and we can circle back and add it to the factory at some point.