blachlylab / dhtslib

D bindings and OOP wrappers for htslib
MIT License
7 stars 1 forks source link

VCF Overhaul #105

Closed charlesgregory closed 3 years ago

charlesgregory commented 3 years ago

Lots of changes went into this.

fixes #96.

New Features:

Though I did introduce a lot of new features, I have tried to leave old functionality behind. (Though with the reorganization and the enums, the API will change for most of dhtslib.vcf).

Fixes:

Concerns:

  1. We sometimes report errors through hts_log and have the function return, sometimes we report errors through hts_log and have the function throw an exception, or sometimes we just throw an exception. We should decide formally on how this should be done across the whole package, #99 probably relevant.
  2. Naming in general. I am not sold on my naming of the new enums, I just wanted them to be more informative than the htslib enums.
  3. Could use more asserts and checks
codecov-commenter commented 3 years ago

Codecov Report

Merging #105 (da10a93) into develop (b4e5e01) will increase coverage by 3.41%. The diff coverage is 91.29%.

:exclamation: Current head da10a93 differs from pull request most recent head 1768e09. Consider uploading reports for the commit 1768e09 to get more accurate results Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #105      +/-   ##
===========================================
+ Coverage    80.40%   83.82%   +3.41%     
===========================================
  Files           40       40              
  Lines         2822     3424     +602     
===========================================
+ Hits          2269     2870     +601     
- Misses         553      554       +1     
Impacted Files Coverage Δ
source/dhtslib/sam/record.d 76.61% <79.59%> (-4.64%) :arrow_down:
source/dhtslib/sam/tagvalue.d 95.90% <83.33%> (-0.63%) :arrow_down:
source/dhtslib/vcf/record.d 91.27% <87.07%> (+0.05%) :arrow_up:
source/dhtslib/sam/reader.d 87.60% <94.44%> (+2.96%) :arrow_up:
source/dhtslib/vcf/header.d 95.58% <95.40%> (+70.58%) :arrow_up:
source/dhtslib/vcf/reader.d 96.92% <96.92%> (+2.05%) :arrow_up:
source/dhtslib/sam/cigar.d 89.76% <100.00%> (+0.33%) :arrow_up:
source/dhtslib/vcf/writer.d 100.00% <100.00%> (+47.54%) :arrow_up:
source/htslib/sam.d 52.17% <100.00%> (+2.17%) :arrow_up:
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b4e5e01...1768e09. Read the comment docs.

charlesgregory commented 3 years ago

Alright I added several changes to the enums. They now refer directly to the htslib BCF_* enums. I expanded some of the enum names to make them more self-explanatory. I made a function for calling bcf_unpack to standardize it for all functions along with unittests to confirm it works.

I also changed the default MAX_UNPACK value to UnpackLevels.None as this should help performance by default and help us better catch bugs where we have not correctly unpacked data before using it. You may disagree with this sentiment. Let me know.

Still todo:

charlesgregory commented 3 years ago

I think this should be ready to go.