georust / rstar

R*-tree spatial index for the Rust ecosystem
https://docs.rs/rstar
Apache License 2.0
410 stars 68 forks source link

Q: Non-euclidean space primitives #70

Open 0x7CFE opened 3 years ago

0x7CFE commented 3 years ago

Hello, I'd like to know if its possible to adapt the crate to non-euclidean spaces.

For instance I'm interested in using rstar to speed up search operations on the following space:

On these pictures you can see a point cloud that represents radius vectors. I need to have a way to quickly enumerate all points within a certain bounding box defined by angles and distances. Depending on settings it can look almost as a square or as a thin sector. So, reducing search space by a simple bounding circle is not always efficient.

So I was wondering, if its possible to use something like (r, theta) as a primitive box to be directly used inside rstar? One issue I see is that in my case angles form a continious space, so a box near 0° would include some points in the 0°.. 5° and 355°..359° ranges.

Thanks in advance!

michaelkirk commented 3 years ago

So I was wondering, if its possible to use something like (r, theta)

It looks like you'd have a third param, based on your first example, since that query seems to start at some distance from the origin, right?

Am I correct that you're querying for points? Or are there features with 2d/3d (lines/polygons)?

Unfortunately, I don't have a very specific answer for you, but for querying purposes, would it be reasonable to project your data into some kind of euclidean space, then handle as a special case a splitting of queries that cross the axis?

...easier said than done, I know.

adamreichold commented 3 years ago

An easy option if you are interested only in point clouds and have a uniform upper bound for the distance of your queries would be to add the points twice, i.e. points close to 360° at 360°-x would also be indexed at -x and points close 0° at x would also be indexed at 360°+x. You then collapse these "mirror points" back to the original points by normalising the angular coordinates of the query results. (This can be generally useful for handling periodic boundary conditions.) (And if you are only using this for point clouds, you might want to look into using a K-D tree instead of an R tree for sheer simplicity.)

Other than the above, I had no issues with using (r, theta) as two-dimensional coordinates of an R tree. Distance is computed using the law of cosines and the angular bounds of the AABB depend on the radius.

0x7CFE commented 3 years ago

It looks like you'd have a third param, based on your first example, since that query seems to start at some distance from the origin, right?

Oh, indeed. Sorry for misleading. Yes, in my case I'm interested in a bounding region (Rmin, Rmax, Amin, Amax) where R is radius and A is angle. So, in a sense it's similar to an AABB but in a curved space.

Am I correct that you're querying for points? Or are there features with 2d/3d (lines/polygons)?

Yes, points only. So in the picrures above I'm interested in enumerating highlighted points given that they belong to the query region.

0x7CFE commented 3 years ago

An easy option if you are interested only in point clouds and have a uniform upper bound for the distance of your queries would be to add the points twice, i.e. points close to 360° at 360°-x would also be indexed at -x and points close 0° at x would also be indexed at 360°+x. You then collapse these "mirror points" back to the original points by normalising the angular coordinates of the query results. (This can be generally useful for handling periodic boundary conditions.) (And if you are only using this for point clouds, you might want to look into using a K-D tree instead of an R tree for sheer simplicity.)

Oh, interesting. Yet, if I understood correctly, that would require adding nearly half of the points twice, since the maximum angle span in my case is 180°. In that case I should probably just use a circle as a bound in euclidean space.

On a second thought, instead of adding points twice, maybe we can do the query twice (above and below 0°) and then just merge the results?

Other than the above, I had no issues with using (r, theta) as two-dimensional coordinates of an R tree. Distance is computed using the law of cosines and the angular bounds of the AABB depend on the radius.

Well yes, I was thinking about something like this.

adamreichold commented 3 years ago

On a second thought, instead of adding points twice, maybe we can do the query twice (above and below 0°) and then just merge the results?

Certainly. The code for doing that ended up slightly more complex in my particular case and if you do having the memory to spare, then adding additional points was also faster IIRC. (I think this is 2logN versus log2N or something like that.)

A more principled approach to handling periodic boundary conditions and thereby also angular coordinates would be PerioR-Tree, i.e. issue #44.