38 / d4-format

The D4 Quantitative Data Format
MIT License

Fix coverage spillover bug #79

Closed Jakob37 closed 3 months ago

Jakob37 commented 4 months ago

Tentative fix for issue #78.

I have also included the fix from #77 so that I could run cargo build.

It has worked well in my local testing. I also verified it manually by extracting the regions with d4tools show and recalculating the results by counting in a Python script. It appears to be doing the right thing in my case (a single non-indexed d4 file).
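A manual check of this kind could be sketched roughly as below. This is not the script used for the PR, only a minimal illustration. It assumes the text printed by d4tools show is bedGraph-like (chromosome, start, end, depth, whitespace-separated, half-open intervals, one track); the function name mean_depth and the sample lines are hypothetical.

```python
from typing import Iterable


def mean_depth(show_lines: Iterable[str], chrom: str, start: int, end: int) -> float:
    """Mean per-base depth over the half-open region [start, end).

    Assumes each line looks like "chrom start end depth" (an assumption
    about the text output, not a documented guarantee).
    """
    if end <= start:
        raise ValueError("region must have positive length")
    total = 0.0
    for line in show_lines:
        fields = line.split()
        if len(fields) < 4 or fields[0] != chrom:
            continue  # skip blank, malformed, or other-chromosome lines
        run_start, run_end, depth = int(fields[1]), int(fields[2]), float(fields[3])
        # Count only the part of the run that lies inside the region,
        # so depth from neighbouring positions cannot spill in.
        overlap = min(run_end, end) - max(run_start, start)
        if overlap > 0:
            total += overlap * depth
    return total / (end - start)


# Made-up sample data: the last run lies entirely outside the region.
sample = [
    "19 90 105 10",
    "19 105 120 40",
    "19 120 200 1000",
]
region_mean = mean_depth(sample, "19", 100, 120)
print(region_mean)
```

Clipping each run to the region boundaries is the important step: a run that starts or ends outside the region contributes only its overlapping bases, which is the behaviour the per-region means from d4tools stat are being compared against.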

Test details:

My BED file (intervals_only_19.bed):

19      45769709        45782552
19      4090321 4124122
19      49635292        49640143
19      55151767        55157773

Running the master branch version:

$ d4 stat --region bed/intervals_only_19.bed hg002_coverage.d4
19      4090321 4124122 47.764267329368955
19      45769709        45782552        45.254146227516934
19      49635292        49640143        34655.42465471037
19      55151767        55157773        37.25041625041625

Running the version from this PR:

$ ./target/debug/d4tools stat --region data/intervals_19_only.bed ~/data/hg002_coverage.d4
19      4090321 4124122 47.764267329368955
19      45769709        45782552        45.254146227516934
19      49635292        49640143        46.97567511853226
19      55151767        55157773        37.25041625041625

I see the same pattern across multiple d4 files that I produced. I did not see it in the test data provided in this repo.

Jakob37 commented 4 months ago

I am unsure who is responsible for d4-tools at the moment. Could any of you, @38 @arq5x @brentp, look into this, or do you know where to turn? Thanks!

brentp commented 4 months ago

If @38 is around, he's the best person to confirm, but in the absence of that I will have a look. @Jakob37, thanks very much.

cademirch commented 3 months ago

This is awesome, thank you for fixing this @Jakob37. I've been using Tasks recently and could not figure out why each TaskPartition was being fed positions outside of its boundaries.

Jakob37 commented 3 months ago

Happy to hear 😄 Thanks @brentp for wrapping this up

brentp commented 3 months ago

Hi, I pushed v0.3.10 with these changes. Please let me know if you need anything else.

arq5x commented 3 months ago

Thank you, @brentp !