georust / rstar

R*-tree spatial index for the Rust ecosystem
https://docs.rs/rstar
Apache License 2.0
384 stars 67 forks source link

Allow 1D `RTree`s #169

Open grovesNL opened 1 month ago

grovesNL commented 1 month ago

This PR removes the parameter validation to allow 1D RTrees to be created.

1D RTrees already seem to work fine as far as I can tell from a simple example I just tested. I've looked through the places where we reference "dimensions" but most of them just loop from 0..dimensions. We even explicitly support single dimensions in 1D tuples and some other places.

Motivation described in #111

Edit: actually it looks like there are some places where it's returning false positives, I'll take a look The false positives were due to #170

michaelkirk commented 1 month ago

Edit: actually it looks like there are some places where it's returning false positives, I'll take a look

Ah, I just saw your edit.

adamreichold commented 1 month ago

I don't think the point of not allowing one-dimensional geometry that it would not work, but rather that R trees are considerably over-engineered for the one-dimensional problem. I think the closest analogue is an interval tree which we for example use for time range queries (where we use an R tree for spatial queries).

I do think that all operations we provide here collapse to the interval tree version for the one-dimensional case, but it seems questionable to use this instead of providing a proper interval tree structure. Consider for example distance computations where we would uselessly compute squares or sometimes even worse square roots when we need the distance instead of the squared distance which would just collapse to computing the absolute value for the interval case.

The main conceptual difference here I think is that basically all nuances around R trees come from having to chose a single dimension each time the search space is to split into smaller parts. A problem which is completely absent when using a single dimension.

grovesNL commented 1 month ago

@adamreichold Agreed, I'd like to use it for the exact use case of implementing an interval tree. I've evaluated most of the existing interval trees/range maps/segment tree crates but haven't been able to find anything suitable (usually cache-inefficient or inflexible).

My thought was that we could try to specialize the 1D RTree with traits as much as possible (beyond this initial support) to avoid some of the distance tests we need for higher dimensions. It's also totally fair to say we don't want support that complexity here and keep it restricted it to 2D or higher.

RTree already has most of the operations I'd like for the 1D case, so I could copy most of the implementation and remove all of the higher dimension functionality to specialize it, but I was hoping there was a way to share most of the implementation.

adamreichold commented 1 month ago

My thought was that we could try to specialize the 1D RTree with traits as much as possible (beyond this initial support) to avoid some of the distance tests we need for higher dimensions. It's also totally fair to say we don't want support that complexity here and keep it restricted it to 2D or higher.

I ran into this as well and my take away was that implementing the one-dimensional case from the ground up as an interval is indeed simpler than working to specialize the n-dimensional case trying to achieve similar efficiency.

(I maintain my own little interval tree called sif-itree so I can memory map it directly, so for me that did solve the flexibility issue because I can implement exactly what I need which is just range queries. I guess PR to improve existing crates if you do not have the memory mapping use case would also be viable. I started out along those lines for example here.)

grovesNL commented 1 month ago

I don't think it would be too difficult to specialize 1D here, but I'm ok with writing my own outside rstar.

For what it's worth I'd also like to eventually specialize 2D calculations too (e.g., SIMD overlap tests on a specialized 2D-only AABB) while sharing the same tree structure, selection iterators, etc.

adamreichold commented 1 month ago

For what it's worth I'd also like to eventually specialize 2D calculations too (e.g., SIMD overlap tests on a specialized 2D-only AABB) while sharing the same tree structure, selection iterators, etc.

I think this should be possible by reimplementing the traits defined by this crate and in the 2D case this should also not impose any infrastructure cost that is not actually necessary, in contrast to the one-dimensional case. (I also think we could carry those two-dimensional reimplementations upstream here because 2D is likely a very common case in GIS.)

grovesNL commented 1 month ago

It looks like the false positives I noticed were due the from_points issue I split off into #170, so this seems to be working fine for 1D.

#[derive(Debug)]
#[repr(transparent)]
struct IntervalAabb(AABB<(i64,)>);

impl RTreeObject for IntervalAabb {
    type Envelope = AABB<IntervalPoint>;

    fn envelope(&self) -> Self::Envelope {
        self.0
    }
}

let envelopes = vec![
    IntervalAabb(AABB::from_corners((3,), (4,))),
    IntervalAabb(AABB::from_corners((0,), (1,))),
    IntervalAabb(AABB::from_corners((10,), (12,))),
];

let tree = RTree::bulk_load(envelopes);

for x in tree.locate_in_envelope_intersecting(&AABB::from_corners((4,), (11,))) {
    println!("{:?}", x);
}

// Outputs:
// IntervalAabb(AABB { lower: (10,), upper: (12,) })
// IntervalAabb(AABB { lower: (3,), upper: (4,) })

Happy to close this PR if we'd prefer to keep 1D disabled. Either way it looks like rstar is already set up to handle 1D so it might be nice to expose it for use cases like this.