blachlylab / dhtslib

D bindings and OOP wrappers for htslib
MIT License
7 stars 1 forks source link

Coordinate system improvements #88

Closed charlesgregory closed 3 years ago

charlesgregory commented 3 years ago

Fixes #81, #82, #85.

codecov-commenter commented 3 years ago

Codecov Report

Merging #88 (2955703) into develop (3f34c1b) will decrease coverage by 0.62%. The diff coverage is 90.66%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop      #88      +/-   ##
===========================================
- Coverage    80.53%   79.90%   -0.63%     
===========================================
  Files           40       40              
  Lines         2805     2832      +27     
===========================================
+ Hits          2259     2263       +4     
- Misses         546      569      +23     
Impacted Files Coverage Δ
source/dhtslib/bed/reader.d 100.00% <ø> (ø)
source/dhtslib/bed/record.d 63.04% <ø> (ø)
source/dhtslib/gff/record.d 82.25% <ø> (ø)
source/dhtslib/sam/record.d 81.25% <0.00%> (ø)
source/dhtslib/vcf/record.d 91.21% <50.00%> (ø)
source/dhtslib/coordinates.d 97.29% <93.61%> (-1.67%) :arrow_down:
source/dhtslib/bed/package.d 100.00% <100.00%> (ø)
source/dhtslib/faidx.d 89.47% <100.00%> (+0.58%) :arrow_up:
source/dhtslib/gff/reader.d 94.28% <100.00%> (+0.16%) :arrow_up:
source/dhtslib/sam/reader.d 86.00% <100.00%> (-0.23%) :arrow_down:
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 3f34c1b...2955703. Read the comment docs.

charlesgregory commented 3 years ago

For ChromCoordinates could you use htslib region parsing code for a potential speed boost (and bug freeness?) Or too hard to use?

Potentially, I will look into htslib and see if that function is exported and how it could be used. Probably more thoroughly tested so its worth a look. Honestly, I need to overload the region structs in dhtslib.sam.reader we did a while ago.

charlesgregory commented 3 years ago

In addition to prior comment on htslib's string parsing which i'd like you to answer, I am also wondering about the use of Coordinate and Coordinates which seems confusing inasmuch as their names are almost identical. It may be too late int he game to change the latter, but what about Interval or Range instead?

The nice thing about version control is any prior implementations we are already using will still work. I am open to changes as long as we don't change the API too often. I would rather we have the best implementation before the paper is out as more people start looking at the code afterwards. After we are done, we can just bump the version to 0.13 or even move to version 1.0.

jblachly commented 3 years ago

For ChromCoordinates could you use htslib region parsing code for a potential speed boost (and bug freeness?) Or too hard to use?

Potentially, I will look into htslib and see if that function is exported and how it could be used. Probably more thoroughly tested so its worth a look. Honestly, I need to overload the region structs in dhtslib.sam.reader we did a while ago.

Great news:

https://github.com/samtools/htslib/blob/43c127ec10d5e38c9f5b5ec3679c3745148f7c61/htslib/hts.h#L1138-L1202

charlesgregory commented 3 years ago

Latest commit changes Coordinates to Interval and removes ChromCoordinates.

jblachly commented 3 years ago

@charlesgregory ChromInterval still exists in current version of PR as type parameter to some function calls in e.g. sam/ ; however, was this type removed ?

charlesgregory commented 3 years ago

I have addressed the remaining concerns in the last two commits. Since we are likely to change this implementation further anyway, I am going to accept this pull request so I can begin further work.