BioJulia / IntervalTrees.jl

A data structure for efficient manipulation of sets of intervals
MIT License
44 stars 17 forks source link

Some intersect optimizations. #22

Closed dcjones closed 8 years ago

dcjones commented 8 years ago

I saw https://github.com/BioJulia/Bio.jl/issues/340 and it prompted me try to figure out why IntervalTrees does so poorly there.

These changes help make it competitive. But there is still a big issue. The intervals tested in that benchmark contain entries that span entire chromosomes.

Chr1    TAIR10  chromosome      1       30427671        .       .       .       ID=Chr1;Name=Chr1

This is the worst case for the intersect algorithm in IntervalTrees, causing it to become basically a linear search. I'm not sure what do about that. I assumed, maybe incorrectly, when writing this that extremely long intervals were rare. This is maybe an inherent shortcoming of IntervalTrees that suggests we should use NCList.

Feel free to leave this open and I'll add commits if I have any epiphanies.

codecov-io commented 8 years ago

Current coverage is 93.55% (diff: 100%)

Merging #22 into master will increase coverage by 0.39%

@@             master        #22   diff @@
==========================================
  Files             3          3          
  Lines           644        683    +39   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            600        639    +39   
  Misses           44         44          
  Partials          0          0          

Powered by Codecov. Last update 085f48b...93a15ee

coveralls commented 8 years ago

Coverage Status

Coverage remained the same at 93.168% when pulling b67124a82c33c9afbec28ccbefbc851e3d2d7147 on int-opt into 085f48b67856400ad8486da59f840b241f028ead on master.

coveralls commented 8 years ago

Coverage Status

Coverage increased (+0.2%) to 93.353% when pulling 7b284bf9d3d43861bd448c410e61487057dea6f9 on int-opt into 085f48b67856400ad8486da59f840b241f028ead on master.

coveralls commented 8 years ago

Coverage Status

Coverage increased (+0.4%) to 93.558% when pulling 93a15ee0076811f1d0877df68d43d6bc4d0702e6 on int-opt into 085f48b67856400ad8486da59f840b241f028ead on master.

coveralls commented 8 years ago

Coverage Status

Coverage increased (+0.4%) to 93.558% when pulling 93a15ee0076811f1d0877df68d43d6bc4d0702e6 on int-opt into 085f48b67856400ad8486da59f840b241f028ead on master.