georust / rstar

R*-tree spatial index for the Rust ecosystem
https://docs.rs/rstar
Apache License 2.0

Add all-nearest-neighbors (a1NN) iterator for tree-to-tree lookup #41

Open aschampion opened 4 years ago

aschampion commented 4 years ago

Add the ability to efficiently find the nearest neighbor in a target tree for each point in a query tree. This works by traversing the query tree depth-first and, at each query tree node, pruning from the set of candidate subtrees of the target tree those that cannot possibly hold the nearest neighbor of any point in the query tree node.

This results in speedups on the order of 1.3x versus individual lookup of the query points, and this speedup increases with the size and dimensionality of the trees.

This all-nearest-neighbors or "a1NN" lookup is common in point cloud comparison algorithms, and is our primary use of rstar.

The goods

$ cargo bench all
all to all tree lookup  time:   [372.09 us 375.74 us 379.91 us]                                   

...

all to all point lookup time:   [494.12 us 496.84 us 500.37 us]      

Note that this difference in performance grows as the trees get larger.

Details of the algorithm

The algorithm works by traversing the query tree depth-first and keeping a stack of pruned subtrees of the target tree for each depth of that traversal. These subtrees cover the potential nearest neighbors of any point in the query tree node at that depth. This lets points in the query tree effectively reuse computation, since each point only needs to search these subtrees for its nearest neighbor.

The pruning works similarly to the existing pruning for single point lookups with min_max_dist_2, only instead of a point-to-envelope tight upper bound, we need an envelope-to-envelope tight upper bound. This is implemented in Envelope::max_min_max_dist_2 for AABB. Conceptually it is the maximum of min_max_dist_2 over all points in the envelope, and is naively implemented that way by iterating over the extrema (corners) of the envelope.
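As a rough illustration (a minimal 2D sketch, not the rstar implementation, using plain arrays in place of rstar's Point and AABB types), here are both bounds: the classic MINMAXDIST point-to-envelope bound, and the naive corner-iterating envelope-to-envelope version described above.

```rust
/// Squared MINMAXDIST: for point `p` and the AABB `[lo, hi]`, a squared
/// distance within which the box is guaranteed to contain some point.
fn min_max_dist_2(p: [f64; 2], lo: [f64; 2], hi: [f64; 2]) -> f64 {
    let mut best = f64::INFINITY;
    for k in 0..2 {
        let mut d = 0.0;
        for i in 0..2 {
            let mid = (lo[i] + hi[i]) / 2.0;
            if i == k {
                // nearest face of the box along dimension k
                let near = if p[i] <= mid { lo[i] } else { hi[i] };
                d += (p[i] - near) * (p[i] - near);
            } else {
                // farthest face along every other dimension
                let far = if p[i] >= mid { lo[i] } else { hi[i] };
                d += (p[i] - far) * (p[i] - far);
            }
        }
        best = best.min(d);
    }
    best
}

/// Naive envelope-to-envelope bound: the maximum of `min_max_dist_2`
/// over the corners (extrema) of envelope A with respect to envelope B.
fn max_min_max_dist_2(
    a_lo: [f64; 2], a_hi: [f64; 2],
    b_lo: [f64; 2], b_hi: [f64; 2],
) -> f64 {
    [
        [a_lo[0], a_lo[1]], [a_lo[0], a_hi[1]],
        [a_hi[0], a_lo[1]], [a_hi[0], a_hi[1]],
    ]
    .iter()
    .map(|&c| min_max_dist_2(c, b_lo, b_hi))
    .fold(0.0_f64, f64::max)
}

fn main() {
    // For p = (0,0) and the box [1,3]x[1,3], each candidate face gives
    // 1 + 9 = 10, so the squared bound is 10.
    assert_eq!(min_max_dist_2([0.0, 0.0], [1.0, 1.0], [3.0, 3.0]), 10.0);
    // Envelope-to-envelope bound between [0,1]^2 and [2,3]x[0,1].
    assert_eq!(
        max_min_max_dist_2([0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [3.0, 1.0]),
        5.0
    );
}
```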

Here's an illustrative diagram:

[image: diagram of two envelopes A and B with the distances p, q, and m marked]

Here p is A.max_min_max_dist_2(&B) and q is B.max_min_max_dist_2(&A). m is the maximum distance between the envelopes; the diagram should prime your intuition that these bounds can do better than that naive maximum-distance bound.

So for each query tree node we consider, we look at the candidate target subtrees of the parent (or the children of those subtrees) and find the minimum max_min_max_dist_2 over that set of subtrees. This means that for any point in the query node, a nearest neighbor must lie within that distance in some subtree. Thus we can prune any subtree whose minimum distance to the query node exceeds that distance.
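A hedged sketch of this pruning step, again with plain 2D arrays standing in for rstar's envelope types, and assuming the minimum max_min_max_dist_2 over the candidate set has already been computed and is passed in as `threshold`:

```rust
/// Squared minimum distance between two AABBs: zero if they overlap,
/// otherwise the squared length of the per-dimension gaps.
fn env_min_dist_2(
    a_lo: [f64; 2], a_hi: [f64; 2],
    b_lo: [f64; 2], b_hi: [f64; 2],
) -> f64 {
    (0..2)
        .map(|i| {
            let gap = (b_lo[i] - a_hi[i]).max(a_lo[i] - b_hi[i]).max(0.0);
            gap * gap
        })
        .sum()
}

/// Keep only candidate target envelopes that could still contain a nearest
/// neighbor for some point in the query envelope: any candidate whose
/// minimum distance exceeds `threshold` (the minimum max_min_max_dist_2
/// over the candidate set) can be discarded.
fn prune_candidates(
    q_lo: [f64; 2],
    q_hi: [f64; 2],
    candidates: Vec<([f64; 2], [f64; 2])>,
    threshold: f64,
) -> Vec<([f64; 2], [f64; 2])> {
    candidates
        .into_iter()
        .filter(|&(lo, hi)| env_min_dist_2(q_lo, q_hi, lo, hi) <= threshold)
        .collect()
}

fn main() {
    let (q_lo, q_hi) = ([0.0, 0.0], [1.0, 1.0]);
    let candidates = vec![
        ([2.0, 0.0], [3.0, 1.0]),   // squared min distance = 1
        ([9.0, 9.0], [10.0, 10.0]), // squared min distance = 128
    ];
    let kept = prune_candidates(q_lo, q_hi, candidates, 5.0);
    assert_eq!(kept.len(), 1); // the distant box is pruned
}
```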

For more details see the implementation. I tried to provide plenty of comments.

Notes

jefferis commented 4 years ago

Dear Andrew, I realised I dropped the ball on responding to an earlier query on this. Your implementation looks neat and is along the lines of what I was thinking (the reference you linked is one of the all-kNN references I had looked at). However, I have to say that I am rather disappointed by the final speed-up. I had honestly expected something in the range of an order of magnitude, given consistent locality for many points. Is it possible that the density of points in neuron data is just a bit low? Do you know where the algorithm is spending its time right now? Is it very different from the naive approach of scanning all points with regular kNN?

aschampion commented 4 years ago

Hi Greg, I think you may have meant to respond at clbarnes/nblast-rs#20, so I'll respond there.

aschampion commented 4 years ago

Reverting this to draft status as there are performance pitfalls for 3d data that will need to be looked into.

Stoeoef commented 4 years ago

Thank you very much for this contribution! This looks really promising and the feature would fit in well with the rstar library.

I'm still trying to understand how all of this works. Thanks a lot for the detailed description and comments, those make this much easier. One thing I'm wondering about is the failing unit test. The distance between the expected / the actual point seems to be a bit too large to be explained just by rounding errors. OTOH, #40 fixes the issue, which may be either a coincidence or the actual fix. I'll try to investigate a bit in that direction.

aschampion commented 4 years ago

There are a few points to consider wrt #40 and precision issues. Most important is that a discrepancy caused by rounding errors, etc., can be of much smaller magnitude than the resulting discrepancy in the distance of the returned neighbor. A discrepancy near the epsilon level in the wrong direction will cause distance_2_if_less_or_equal to return None. This can cause the correct neighbor to be disregarded and other neighbors to be returned instead, which may be arbitrarily far from the correct neighbor. The same is true for the pruning in this PR.

I'm running into this right now after making several fixes and refactors to this PR, in that the all-nearest-neighbors iterator and the single nearest neighbor function occasionally return different results. This isn't because one is wrong, but because both are: sometimes one yields the correct neighbor, sometimes the other. This only shows up consistently when exercised with large (> 100K node) 3D trees, but of course may be happening undetected, with one or both methods, more frequently.

As a real example, here's a node being ignored by the existing single nearest neighbor lookup via distance_2_if_less_or_equal due to a small difference (0.00000000000000109):

[rstar/src/object.rs:226] distance_2 = 0.008569106766881232
[rstar/src/object.rs:226] max_distance_2 = 0.00856910676688123

which results in this being ignored and thus a much larger difference (4%!) in the returned neighbor vs the correct one given by all_nearest_neighbors:

[rstar/src/algorithm/nearest_neighbor.rs:538] single_neighbor.unwrap().distance_2(neighbors.query) = 0.008943686353908992
[rstar/src/algorithm/nearest_neighbor.rs:538] neighbors.target.distance_2(neighbors.query) = 0.008569106766881232
thread 'algorithm::nearest_neighbor::test::test_all_nearest_neighbors' panicked at 'assertion failed: `(left == right)`
  left: `Some([0.016142428043963264, 0.3308523413774229, 0.08940400251818637])`,
 right: `Some([0.12212996229489037, 0.18978760533844952, 0.07978876040511107])`', rstar/src/algorithm/nearest_neighbor.rs:540:13

While min_max_dist_2 could be further changed to something even more identical to the results of length_2 (with #40 it differs from length_2 only in the order in which the final additions occur, but this is already enough to produce a small discrepancy due to non-associativity), the underlying issue remains that distances generated in different ways (such as through Envelope::distance_2 vs PointExt::distance_2) are compared. While returning different neighbors that differ only by an epsilon is not a problem, comparisons elsewhere will be.
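A one-line illustration of the non-associativity in question (not rstar code):

```rust
fn main() {
    // Summing the same terms in a different order can change the result
    // by about one ulp, which is enough to flip a hard <= comparison.
    let (a, b, c) = (0.1_f64, 0.2_f64, 0.3_f64);
    assert_ne!((a + b) + c, a + (b + c));
    println!("{:?} vs {:?}", (a + b) + c, a + (b + c)); // 0.6000000000000001 vs 0.6
}
```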

One solution could be adding an epsilon associated const or method to RTreeNum, which distance_2_if_less_or_equal and other hard thresholding comparisons can use. For int types this would be Zero::zero and for floats it could be their associated ::EPSILON. How does that sound?
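A minimal sketch of that proposal, with entirely hypothetical names (`Tolerance`, `tolerance()`, `within_max`) standing in for whatever would actually be added to RTreeNum, and a plain absolute epsilon rather than a relative one:

```rust
/// Hypothetical associated tolerance: zero for integer coordinate types,
/// machine epsilon for floats. Names are illustrative only, not rstar API.
trait Tolerance {
    fn tolerance() -> Self;
}

impl Tolerance for f64 {
    fn tolerance() -> Self {
        f64::EPSILON
    }
}

impl Tolerance for i64 {
    fn tolerance() -> Self {
        0
    }
}

/// A thresholding check in the spirit of distance_2_if_less_or_equal:
/// accept distances within one tolerance of the cutoff instead of
/// rejecting a candidate over an epsilon-level discrepancy.
fn within_max<T>(distance_2: T, max_distance_2: T) -> bool
where
    T: Tolerance + std::ops::Add<Output = T> + PartialOrd,
{
    distance_2 <= max_distance_2 + T::tolerance()
}

fn main() {
    // The epsilon-level discrepancy from the debug output above no longer
    // causes the candidate neighbor to be rejected.
    assert!(within_max(0.008569106766881232_f64, 0.00856910676688123_f64));
    // Genuinely larger distances are still rejected.
    assert!(!within_max(0.009_f64, 0.0085_f64));
    // Integer coordinates keep exact comparison semantics.
    assert!(within_max(4_i64, 4_i64));
}
```

Note that an absolute `f64::EPSILON` is only appropriate for coordinates of roughly unit magnitude; a production version would likely need a relative tolerance.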

aschampion commented 4 years ago

For now, a small change to #40 makes its length_2 results identical to those before #35 while retaining most of the performance gained there, so I will push that amended change to #40 later this weekend. That prevents most of the discrepancy issues between the different nearest neighbor methods.