georust / rstar

R*-tree spatial index for the Rust ecosystem
https://docs.rs/rstar
Apache License 2.0
386 stars 67 forks source link

Remove Copy bound on Point and Envelope? #103

Closed Andlon closed 1 year ago

Andlon commented 1 year ago

I have a use case in fenris where I'm working on heavily generic code based on nalgebra's typenum based traits, and I'd like to use rstar for some spatial queries. Since my use of rstar here is supposed to be an implementation detail, I cannot add trait bounds, and for complex reasons related to deficiencies of the Rust type system, I can't make it work with the current Point: Copy and Envelope: Copy bounds. I was curious to see how necessary it is, and it turns out that these bounds can be removed without any breaking changes (as far as I can tell) by sprinkling some .clone() calls in various places.

Would it be interesting for you to lift the Copy bound? If not I might have to maintain my own fork just for fenris, which I'd rather like to avoid, but I can understand if you don't want to remove the bound.

Below is the output of cargo bench, by first running it on the current master branch, and afterwards on this branch. I tried to run it under the exact same conditions by shutting down most running background applications. For some reason one of the benchmarks reports being 6% faster, which I must still attribute to noise or random optimizer shenanigans, as I don't see how the changes I've made should make anything faster.

     Running benches/benchmarks.rs (target/release/deps/benchmarks-78cc23f600fbf43e)
Bulk load baseline      time:   [140.61 µs 140.64 µs 140.67 µs]                               
                        change: [-0.1287% -0.0943% -0.0606%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking rstar and spade benchmarks/rstar sequential: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.1s, enable flat sampling, or reduce sample count to 60.
rstar and spade benchmarks/rstar sequential                                                                             
                        time:   [1.1893 ms 1.1895 ms 1.1896 ms]
                        change: [-2.1223% -2.0942% -2.0670%] (p = 0.00 < 0.05)
                        Performance has improved.

bulk load quality       time:   [129.00 µs 129.06 µs 129.13 µs]                              
                        change: [-0.7067% -0.6304% -0.5579%] (p = 0.00 < 0.05)
                        Change within noise threshold.

sequential load quality time:   [177.29 µs 177.31 µs 177.34 µs]                                    
                        change: [+0.8951% +0.9252% +0.9522%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  5 (5.00%) high mild
  1 (1.00%) high severe

locate_at_point (successful)                                                                            
                        time:   [185.26 ns 185.71 ns 186.14 ns]
                        change: [-6.5996% -6.3658% -6.1379%] (p = 0.00 < 0.05)
                        Performance has improved.

locate_at_point (unsuccessful)                                                                            
                        time:   [313.00 ns 313.34 ns 313.68 ns]
                        change: [-0.2903% -0.1584% -0.0239%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

     Running unittests src/main.rs (target/release/deps/rstar_demo-a0db318fb3a605e3)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
rmanoka commented 1 year ago

Thanks for the PR @Andlon . Typically the point or similar small types are kept Copy to prevent the clutter with lots of clone() calls. This is a matter of preference, so we should hear from other maintainers / authors too before talking a call. Would be good if you could provide more details on how the Copy bound inhibits currently.

michaelkirk commented 1 year ago

I'm personally ok with removing the Copy bounds, but it'd be good to hear from other voices.

Andlon commented 1 year ago

Well, I've got a couple of use cases:

Arbitrary/exact precision. For example using a BigRational type for coordinates. I'm currently getting around this by converting to f64 and significantly enlarging all bounding boxes to hopefully account for rounding errors. I'm only using rstar for finding candidate bounding boxes here, I don't put the actual geometries in the tree itself. So I think this isn't too important overall.

The second is related to nalgebra and typenum. Imagine a function like this:

fn find_closest_cell<D: DimName>(mesh: &Mesh<f64, D>, point: &OPoint<f64, D>)
where
    // This is an nalgebra-specific thing. There's unfortunately no way to prove to the compiler at present that this bound
    // is redundant.
    DefaultAllocator: Allocator<f64, D>
{
    // Do something with rstar
}

Now, OPoint<f64, D> is not Copy unless I explicitly ask for either OPoint<f64, D>: Copy or <DefaultAllocator as Allocator<f64, D>>::Buffer: Copy, neither of which I can do because I'm not willing to propagate these trait bounds everywhere. Therefore, because OPoint<f64, D> is not copy, I also cannot use rstar in a dimension-generic setting.

These problems go away if you use const generics, that is Point<T, D> where const D: usize is a generic parameter. However, the problem I have is that my dimensions D come from an associated type, e.g.

trait Trait {
    type Dimension: DimName;
}

since there is no way to use associated constants with const generics at present, this is not an option for me, and I have to stick to the typenum-based API that nalgebra offers, with the aforementioned restrictions.

rmanoka commented 1 year ago

Supporting arbitrary precision types indeed sounds like a good benefit. The future downside is a few more clone() calls, but doesn't sound too bad since it's incremental.

@urschrei @Stoeoef or others, any comments? I'm inclined to merge this unless you are against this.

@Andlon Pls. add a change-log entry; I guess this is a breaking change.

adamreichold commented 1 year ago

My expectation here would be that this optimizes to the same machine copy when the actual types implementing Point and Envelope are Copy. I would also expect that the Clone invocations are only a burden in the generic code in this crate's implementation, not for its users when they actually use Copy types.

Andlon commented 1 year ago

@rmanoka: sorry, I should clarify. This PR doesn't by itself enable use of arbitrary types directly, although it is a necessary requirement that both Point and Envelope be non-Copy, so it is a step towards that. However, we still have RTreeNum: Copy + Bounded. Both Copy and Bounded cannot be implemented by arbitrary precision types.

I realize this information might change your inclination to accept the PR. I'll update the changelog anyway so that it would be ready to be merged if you all decide for it.

I believe it should not be a breaking change, as far as I can tell, since we are only removing a trait bound and otherwise making no changes to function/method signatures. Would appreciate a review to check this claim though, I'm not at all familiar with the code base and might have missed something.

Btw, is there no CI running on PRs...? I don't see any.

My expectation here would be that this optimizes to the same machine copy when the actual types implementing Point and Envelope are Copy. I would also expect that the Clone invocations are only a burden in the generic code in this crate's implementation, not for its users when they actually use Copy types.

@adamreichold: that's my line of thinking too. Optimizations are always a bit finnicky and unpredictable, but from experience the compiler is able to inline the .clone() as long as short functions are inlined/inlineable, which in rstar they'll be because they're basically all generic.

urschrei commented 1 year ago

bors try

bors[bot] commented 1 year ago

try

Build succeeded:

rmanoka commented 1 year ago

@Andlon Thanks for the clarification, I'm still inclined to merge as it's a step towards it.

@urschrei pls merge if happy with the CI. Or I'll merge it over the weekend

rmanoka commented 1 year ago

@Andlon I think it's breaking as the traits are public and eg. usage of copy on T: Point will now fail to compile. Pls. mark the change-log entry as such.

Andlon commented 1 year ago

@rmanoka: yeah, you're of course totally right. I forgot that someone other than rstar might be writing generic code with the Point trait. I'll update the changelog next chance I get - from a quick look it seems as if I just mark the change as "BREAKING". Correct me if wrong!

rmanoka commented 1 year ago

bors r+

@Andlon Yes; took the liberty of making the changes. Merging. Thanks again for the PR!

bors[bot] commented 1 year ago

Build succeeded:

Andlon commented 1 year ago

Thanks @rmanoka!

culebron commented 1 year ago

I wonder if this is the reason why simple code now fails to compile:

use geo::{Point, Translate};
use rstar::{RTreeObject, AABB};

#[derive(Debug, Clone)]
pub struct Stop {
    pub id: u32,
    pub geom: Point<f32>,
    pub buff: f32,
}

impl RTreeObject for Stop {
    type Envelope = AABB<Point<f32>>;
    fn envelope(&self) -> Self::Envelope {
        let p1 = self.geom.translate(-self.buff, -self.buff);
        let p2 = self.geom.translate(self.buff, self.buff);
        AABB::from_corners(p1, p2)
    }
}

Compiles with v0.9, fails since v0.10:

fn envelope(&self) -> Self::Envelope {
                      ^^^^^^^^^^^^^^ the trait `rstar::Point` is not implemented for `geo::Point<f32>`

Treating this literally, I can't fix it at all, because I own neither Point, nor rstar::Point, they're from other crates.

And from the discussion, I don't see how to fix it. Removing Copy bound seems to relax requirements, so things should have still worked, but they don't.

urschrei commented 1 year ago

That error usually means a version mismatch somewhere. What versions of rstar and geo/geo-types are you using?

adamreichold commented 1 year ago

That error usually means a version mismatch somewhere. What versions of rstar and geo/geo-types are you using?

I think we should diagnose the version issues in #128.