samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/
283 stars 242 forks source link

Use allele info in VariantContext comparisons for stable sorts #1593

Closed clintval closed 1 year ago

clintval commented 2 years ago

TLDR: This PR will make coordinate VCF sorting in HTSJDK/Picard similar to bcftools.

I tried to use Picard's SortVcf for a VCF comparison task, but I had difficulty because it's output was non-stable based on the full primary information of a variant context (contig, start, and alleles). I finally chose the bcftools implementation because it has a slightly more-stable sorting method. However, I want to ensure future users of Picard don't hit the same issue I did, hence this PR.

Before I write unit tests, is this something you are willing to accept?

codecov-commenter commented 2 years ago

Codecov Report

Merging #1593 (2272c16) into master (b5af659) will decrease coverage by 0.064%. The diff coverage is 0.000%.

:exclamation: Current head 2272c16 differs from pull request most recent head 5089e22. Consider uploading reports for the commit 5089e22 to get more accurate results

@@               Coverage Diff               @@
##              master     #1593       +/-   ##
===============================================
- Coverage     69.839%   69.775%   -0.064%     
+ Complexity      9647      9641        -6     
===============================================
  Files            702       702               
  Lines          37638     37535      -103     
  Branches        6113      6080       -33     
===============================================
- Hits           26286     26190       -96     
+ Misses          8903      8888       -15     
- Partials        2449      2457        +8     
Impacted Files Coverage Δ
...riant/variantcontext/VariantContextComparator.java 0.000% <0.000%> (ø)
...htsjdk/samtools/util/nio/DeleteOnExitPathHook.java 80.952% <0.000%> (-9.524%) :arrow_down:
...sjdk/samtools/util/htsget/HtsgetErrorResponse.java 73.333% <0.000%> (-6.667%) :arrow_down:
...tools/cram/encoding/core/GammaIntegerEncoding.java 86.667% <0.000%> (-6.667%) :arrow_down:
.../samtools/cram/compression/ExternalCompressor.java 69.565% <0.000%> (-4.509%) :arrow_down:
...mtools/cram/encoding/core/BetaIntegerEncoding.java 91.304% <0.000%> (-4.348%) :arrow_down:
...am/encoding/core/CanonicalHuffmanByteEncoding.java 52.941% <0.000%> (-2.941%) :arrow_down:
...ing/core/huffmanUtils/HuffmanParamsCalculator.java 80.000% <0.000%> (-1.818%) :arrow_down:
...va/htsjdk/samtools/util/htsget/HtsgetResponse.java 89.394% <0.000%> (-1.783%) :arrow_down:
...java/htsjdk/beta/codecs/variants/vcf/VCFCodec.java 81.818% <0.000%> (-1.515%) :arrow_down:
... and 39 more
clintval commented 2 years ago

Any interest from the maintainers in this PR?

Anything I can do to help move it closer to merge?

clintval commented 2 years ago

@cmnbroad I hit another instance where this would have helped me.

Do you want me to invest the time to implement some unit tests, or is there no interest in pursuing this PR?

cmnbroad commented 2 years ago

@clintval Stable sounds good to me, but @lbergelson who is out on vacation really needs to weigh in (I'm not actually a maintainer).

lbergelson commented 2 years ago

@clintval I'm sorry I haven't been responsive. This is a good idea and having some tests would be great.

clintval commented 1 year ago

@lbergelson I don't normally write in Java so I hope my last commits are OK!

clintval commented 1 year ago

@lbergelson nice work and thank you kindly for the review!