tzaeschke / tinspin-indexes

Spatial index library with R*Tree, STR-Tree, Quadtree, CritBit, KD-Tree, CoverTree and PH-Tree
http://www.tinspin.org
Apache License 2.0
111 stars 24 forks source link

New lazy nearest neighbor ranged search for RTree, with Iterator.remove() support #8

Closed chris0385 closed 7 years ago

codecov-io commented 7 years ago

Codecov Report

Merging #8 into master will decrease coverage by -0.26%. The diff coverage is 61.97%.

@@             Coverage Diff              @@
##             master       #8      +/-   ##
============================================
- Coverage     69.81%   69.56%   -0.26%     
- Complexity     1147     1196      +49     
============================================
  Files            46       48       +2     
  Lines          5371     5582     +211     
  Branches       1105     1158      +53     
============================================
+ Hits           3750     3883     +133     
- Misses         1383     1433      +50     
- Partials        238      266      +28
Impacted Files Coverage Δ Complexity Δ
src/main/java/org/tinspin/index/rtree/RTree.java 77.27% <100%> (+2.41%) 51 <2> (+4) :white_check_mark:
...ain/java/org/tinspin/index/rtree/RTreeNodeDir.java 93.75% <100%> (+0.2%) 15 <0> (ø) :x:
...java/org/tinspin/index/rtree/DistanceFunction.java 44.06% <31.7%> (-28.16%) 5 <4> (+4)
src/main/java/org/tinspin/index/rtree/Filter.java 57.89% <57.89%> (ø) 2 <2> (?)
.../java/org/tinspin/index/rtree/RTreeMixedQuery.java 68.96% <68.96%> (ø) 39 <39> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 9851c2a...d484603. Read the comment docs.

tzaeschke commented 7 years ago

SQRT slow: According to this, FSQRT takes just one cycle on Pentium 2 or later CPUs (or on SSE). There is some latency, but I was never able to see a performance difference during my index tests. I'll leave the code in, but I will mark it as deprecated for now, possibly removing it at some point if we cannot see a benefit. It may actually harm performance by providing 4 implementation of DistanceFunction, which causes the JVM to resort to expensive polymorphism rather than the fast bimorphism (see here), depending on how it is used. Did you see any improvement during your tests from avoiding SQRT?

chris0385 commented 7 years ago

I didn't make any test on sqrt() performance before your comment. I did some test now and there were indeed no noticeable performance impacts. We can therefore remove CENTER_SQUARE and EDGE_SQUARE (and maybe add a comment that the sqrt has no measurable impact - at least on my Intel Core i5 with OpenJDK 64-Bit version 1.8).

Thank you for the reference on the article about the JVM and polymorphism.

I also made some test of the performance of my Iterator.remove() method. Currently putting the elements in an ArrayList and doing a tree.remove(e.lower(), e.upper()); after the iteration is faster (up to 10 times faster when removing 10000 elements in a tree of 20000). When removing many elements, it's probably worth implementing this extra step rather than using the current remove() method.

tzaeschke commented 7 years ago

I pushed everything into the master branch now. Thanks a lot for the contribution! :-) If you like, I can create a contributor list in the README, with your name or GitHub alias? Let me know.