BioJulia / IntervalTrees.jl

A data structure for efficient manipulation of sets of intervals
MIT License
44 stars 17 forks source link

Some optimizations to tree intersection #46

Open dcjones opened 6 years ago

dcjones commented 6 years ago

Here's some progress on reducing overhead in the tree intersection iterator. My test case is finding all intersections between RNA-Seq reads and annotated transcripts in a dataset with ~18 million such intersections. With this I was able to improve run time from 46 seconds to 6.7 seconds.

Don't ask me to explain why storing the state in nullables works better than Union{Nothing, ...}. The compiler works in mysterious ways. :man_shrugging:

StefanKarpinski commented 6 years ago

Please do report any cases that you encounter like this where Union{Nothing, ...} doesn't have the performance it should so that we can investigate and improve the compiler to fix it!

codecov[bot] commented 6 years ago

Codecov Report

Merging #46 into master will decrease coverage by 0.47%. The diff coverage is 82.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
- Coverage   97.65%   97.18%   -0.48%     
==========================================
  Files           3        3              
  Lines         596      603       +7     
==========================================
+ Hits          582      586       +4     
- Misses         14       17       +3
Impacted Files Coverage Δ
src/IntervalTrees.jl 97.14% <82.75%> (-0.55%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2f63a6c...2e6074d. Read the comment docs.