R* Tree insertion performance - Githubissues

Stoeoef / spade

Delaunay Triangulations for the Rust Ecosystem

Apache License 2.0

272 stars 48 forks source link

R* Tree insertion performance #16

Closed urschrei closed 6 years ago

urschrei commented 7 years ago

I'm wondering whether you've benchmarked insertion performance when populating the R* Tree; We're currently using Spade in a couple of places in rust-geo (it's working perfectly, thank you!), and our primary applications involve one-time bulk-inserting line segments into the tree. It's hard to know without comparing to other libraries, but performance seems slower than expected; I see times around 90 ms for inserting segments built from ~9k points. If you have any ideas for improving this, or ideas for improving insertion performance in the library, I'd be interested in helping out with development.

Stoeoef commented 6 years ago

Thanks for the input! It's great to hear that spade has found usage in another project! Historically, I've created the r-tree first, then noticed that a Delaunay Triangulation fits my own purposes way better and focused on the development of the triangulation. That's why I have not yet tried to fully optimize the R-tree. However, R-Trees are intrinsically slow compared to many other algorithms as they often require many random memory accesses, thus I'm not sure on how fast I can make them.

Here are some ideas that might help you in your scenario:

Sorting the points prior to insertion with a spatial ordering like z-order supposedly improves performance as it should result in fewer cache misses. I'm curious if this works, I'll create a test case and share my results
The performance of spatial retrieval algorithms heavily depends on the specific use case. The major strength of R-Trees is their flexibility, meaning that they can be modified after some data has been inserted without the need to rebuild the whole structure. If you do not need this flexibility in your application, other algorithms - like kd-trees or quadtrees - might perform better. Would this be an option in your case? There should be some existing implementations of kd-trees for rust.
Inserting many points at once (Something like RTree::insert_list(Vec<Line>) could probably be sped up significantly, I'll have to think through that. If I come up with a good idea, I'll let you know. If you're still interested in helping out, this might be a good opportunity.

Stoeoef commented 6 years ago

I've tried out the z-order approach which was surprisingly easy thanks to the morton crate. It does increase performance significantly, but only for a very large number of points. In my test cases, only after inserting about ~50000 points it turned out to be faster. So, to address your problem, assuming that you have around 9000k line segments that are inserted into the tree, the presorted insertion might not yield any speedup. But since I want to implement it anyway, you can easily give it a try once it is there. Additionally, all my tests and optimizations so far were designed to improve the insertion of points. There might be some optimizations that only apply when inserting lines, I'll keep this issue updated once I know more.

urschrei commented 6 years ago

Thank you for working on this! We actually have two specific use cases (one current, one will land when I finish up the work and will be far more widely used) for R* trees in rust-geo:

We use the tree to store line segments, in order to quickly test for intersections, giving us a validity-preserving line-simplification algorithm (similar to here). This functionality does require new segments to be inserted, so I think the R* tree is appropriate. Also, Now that you've implemented PartialEq, for SimpleEdge, I should be able to remove some extra calls to remove edges, which will definitely give some speedup. The edges that are initially inserted are sorted (they're polygon ring segments), but I wasn't aware of z-order sorting.
The second case also involves up-front bulk insertion of segments, in order to calculate the distance between two non-convex polygons, by loading all segments from Polygon A into a tree, finding the closest segment for all points in Polygon B, then doing the opposite, keeping the smallest point-segment distance of the two runs as the minimum polygon distance. It's a brute-force approach, but the R* tree cuts down the processing time a lot. I know Rbush uses a combination of OMT and Floyd-Rivest selection for point insertion, but I don't know whether those approaches are useful for segments.

I should probably try to run some detailed benchmarks against Rbush, and also libspatialindex (which also has a bulk-insertion mode), but the last time I checked, both were around an order of magnitude faster (excepting the possibility that my Rust code was really inefficient in some other way, but I was pretty careful)

Stoeoef commented 6 years ago

I've tried to compare spade with libspatialindex, so far spade seemed to outperform libspatialindex. However, maybe I'm not using libspatialindex correctly, or its parameters are not set up ideally. Maybe my test dataset (randomly generated, sometimes intersecting lines) was not chosen carefully enough, so this comparison should be taken with a grain of salt. I didn't manage to create a comparison with RBush yet, but given that they use specialised algorithms for insertion I believe it should be faster. I didn't read the papers about these algorithms yet, that'll be the next step.

Stoeoef commented 6 years ago

I've implemented the OMT-thingy, you'll find it on the master branch. Try it out with RTree::bulk_load(vec_with_data); This is only bulk-loading, bulk insertion (= insert many elements into a non empty tree) is still an issue. Still, you should see a significant speedup in your second scenario.

urschrei commented 6 years ago

Amazing, I'll try it out over the weekend. Thank you!

urschrei commented 6 years ago

I need to break down the exact speedup, but using bulk_load reduces the time on my benchmark by around 55%. This is great!

Stoeoef commented 6 years ago

I have compared the performance of spade with rbush (running locally on the same machine). Spade ran sequential insertions about 1.4 times faster than rbush, bulk loading ran about 2.7 times faster. Just make sure that you have rust's optimizations enabled, they make a huge difference: The optimizations speed up the sequential insertion about 150 times, bulk load is improved about 190 times. I think I'm done optimizing sequential insertion and bulk loading for now, I'll create a new issue for implementing bulks insertion - the available algorithms seem to be either suboptimal or more complex and can be covered there more in detail.