blachlylab / dhtslib

D bindings and OOP wrappers for htslib
MIT License
7 stars 1 forks source link

Cigar and md changes #58

Closed charlesgregory closed 3 years ago

charlesgregory commented 3 years ago

Pull request for discussion of changes. md.d and cigar.d pulled under dhtslib.sam Can now use opIndex and opSlice for indexing and assignment with Cigar struct directly. Should fix #40 by adding dups in ctors. If ctor is used, Cigar struct should own the cigar string data. Any other changes to be made?

codecov-io commented 3 years ago

Codecov Report

Merging #58 (9284043) into develop (0402081) will increase coverage by 0.25%. The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop      #58      +/-   ##
===========================================
+ Coverage    60.81%   61.07%   +0.25%     
===========================================
  Files           18       18              
  Lines         1503     1513      +10     
===========================================
+ Hits           914      924      +10     
  Misses         589      589              
Impacted Files Coverage Δ
source/dhtslib/sam/package.d 65.67% <ø> (ø)
source/dhtslib/sam/cigar.d 87.27% <100.00%> (ø)
source/dhtslib/sam/md.d 93.02% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 0402081...9284043. Read the comment docs.

jblachly commented 3 years ago
  1. Thanks for tackling this
  2. I should have set up CI and coverage a loooong time ago
    • note that the testing caught the error that needed to be fixed wtih the 4th commit

General comments:

  1. When merging, make sure to reference and /or close #48
  2. Should we continue to make available dhtslib.cigar and dhtslib.sam so as not to break client code? Probably fishnet, the error correction software, and other client code are going to have to be updated unless we do, but it will also clutter the namespace. Feel free to say "no" as we haven't done a public dhtslib release yet and only our code depends. should be easy to find and fix with a regex across codebases

Can the indexing overloads be exercised with test cases?

Oh, scratch that, I see the next commit, and coverage report confirms

charlesgregory commented 3 years ago

Should we continue to make available dhtslib.cigar and dhtslib.sam so as not to break client code? Probably fishnet, the error correction software, and other client code are going to have to be updated unless we do, but it will also clutter the namespace. Feel free to say "no" as we haven't done a public dhtslib release yet and only our code depends. should be easy to find and fix with a regex across codebases

Since dlang tends to be good about versioning its packages I would say no. It can be up to the programmer to update their code for api changes.

charlesgregory commented 3 years ago

So I have removed the dup fixes in cigar as that doesn't truly fix the issue of a cigar potentially having access to free'd bam1_t data. Current plan is using reference counting on both SAMRecord and Cigar. Cigar will have a method that allows one to check its current reference count. SAMRecord then has a private Cigar that it will pass out via cigar() method. Once both the private Cigar and SAMRecord have reference counts of 0 the backing bam1_t can be free'd. Though this may be held up by #57. Any thoughts or concerns?

jblachly commented 3 years ago

So I have removed the dup fixes in cigar as that doesn't truly fix the issue of a cigar potentially having access to free'd bam1_t data. Current plan is using reference counting on both SAMRecord and Cigar. Cigar will have a method that allows one to check its current reference count. SAMRecord then has a private Cigar that it will pass out via cigar() method. Once both the private Cigar and SAMRecord have reference counts of 0 the backing bam1_t can be free'd. Though this may be held up by #57. Any thoughts or concerns?

Hmm. I hate to couple them so tightly. We could really just discourage the constructor that takes the pointer and rely on the consumer to be an expert to use this when you know you won't have a use-after-free problem. There is an alternative constructor that takes a dynamic array or slice that could be favored in other cases.

I can accept this PR as is

charlesgregory commented 3 years ago

After we fix #57, I can make another pull request to further hash the issue of reference counting and ownership.