georust / rstar

R*-tree spatial index for the Rust ecosystem
https://docs.rs/rstar
Apache License 2.0
415 stars 68 forks source link

Performance degradation following 0.12.1 release #183

Closed chingiztob closed 3 weeks ago

chingiztob commented 3 weeks ago

After upgrading the Rstar crate from version 0.12 to 0.12.1, I've experienced a dramatic slowdown in performance when using .nearest_neighbor in my public transit graph application. The operation, which connects all transit stops to the nearest street nodes, has slowed from ~50ms to over 30 seconds. All geometries are f64 geo-types points.

-- Expected Performance: ~2-5μs per .nearest_neighbor call (as observed with version 0.12) -- Observed Performance: 10-20ms per .nearest_neighbor call with version 0.12.1 -- Change: Only the Rstar crate version was updated (from 0.12 to 0.12.1).

Cargo Flamegraph profiling didn’t pinpoint the issue due to heavy interference from Polars-related calls. flamegraph_0_12

pub type IndexedPoint = GeomWithData<Point, Option<NodeIndex>>;

fn find_nearest_point_and_calculate_distance(
    point: &Point,
    tree: &RTree<IndexedPoint>,
) -> Result<(NodeIndex, f64), Error> {
    let instant = std::time::Instant::now();
    if let Some(nearest_point) = tree.nearest_neighbor(point) {
        println!("This took {:?}", instant.elapsed());
        let distance = Haversine::distance(*point, *nearest_point.geom()) / WALK_SPEED;
        let node = nearest_point.data.ok_or_else(|| {
            Error::NodeNotFound(format!("Nearest node not found for point {point:?}"))
        })?;
        Ok((node, distance))
    } else {
        Err(Error::NodeNotFound(format!(
            "Nearest node not found for point {point:?}"
        )))
    }
}

pub(crate) fn build_rtree(graph: &DiGraph<GraphNode, GraphEdge>) -> RTree<IndexedPoint> {
    let index_geo_vec: Vec<IndexedPoint> = graph
        .node_indices()
        .map(|node| {
            let node_data = graph.node_weight(node).unwrap();
            let node_point: Point = *node_data.geometry();
            IndexedPoint::new(node_point, Some(node))
        })
        .collect();

    RTree::bulk_load(index_geo_vec)
}

/// Connects Transit nodes (stops) to the nearest walk nodes.
pub(crate) fn connect_stops_to_streets(graph: &mut TransitGraph) -> Result<(), Error> {
    let instant = std::time::Instant::now();
    let rtree: RTree<IndexedPoint> = graph.rtree_ref().unwrap().clone();

    for node in graph.node_indices() {
        // check if there is already a transfer edge
        // This is required to avoid creating duplicate transfer edges
        // when merging multiple transit graphs on top of main street graph
        // (Work in progress)
        if graph
            .edges(node)
            .any(|edge| matches!(edge.weight(), GraphEdge::Transfer(_)))
        {
            continue;
        }

        let weight = graph
            .node_weight(node)
            .ok_or_else(|| Error::MissingValue("Node weight not found".to_string()))?;

        if let GraphNode::Transit(_) = weight {
            if let Ok((nearest_point_index, distance)) =
                find_nearest_point_and_calculate_distance(weight.geometry(), &rtree)
            {
                let edge = GraphEdge::Transfer(WalkEdge {
                    edge_weight: distance,
                });

                graph.add_edge(node, nearest_point_index, edge.clone());
                graph.add_edge(nearest_point_index, node, edge);
            }
        }
    }
    println!("This took {:?}", instant.elapsed());
    Ok(())
}
adamreichold commented 3 weeks ago

Just going out on a hunch, but could you test with a Git dependency using the branch from https://github.com/georust/rstar/pull/181 ? Thanks!

chingiztob commented 3 weeks ago

Just going out on a hunch, but could you test with a Git dependency using the branch from #181 ? Thanks!

that one?

rstar = { git = "https://github.com/michaelkirk/rstar.git", rev = "222b01a40953c4fae35b38fc7d19416a29136818"}
michaelkirk commented 3 weeks ago

no, 222b01a40953c4fae35b38fc7d19416a29136818 doesn't change any behavior, it just adds a failing unit test.

https://github.com/georust/rstar/pull/181/commits/12836a98fc8e04e9c0668cb4c4f273f47e430697 changes behavior.

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

adamreichold commented 3 weeks ago

You could also try

rstar = { git = "https://github.com/georust/rstar.git", branch = "new-empty-min-max" }
adamreichold commented 3 weeks ago

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

But yeah, this would certainly be the most thorough solution.

chingiztob commented 3 weeks ago

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

But yeah, this would certainly be the most thorough solution.

idk why, but i get compiler errors when trying to use git dependency

error[E0277]: the trait bound `rstar::primitives::GeomWithData<geo::Point, std::option::Option<petgraph::prelude::NodeIndex>>: rstar::RTreeObject` is not satisfied
   --> cascade-core/src/connectors.rs:62:12
    |
62  |     tree: &RTree<IndexedPoint>,
    |            ^^^^^^^^^^^^^^^^^^^ the trait `rstar::RTreeObject` is not implemented for `rstar::primitives::GeomWithData<geo::Point, std::option::Option<petgraph::prelude::NodeIndex>>`
    |
    = help: the trait `rstar::RTreeObject` is implemented for `rstar::primitives::GeomWithData<R, T>`
note: required by a bound in `rstar::RTree`
   --> /home/chingiz/.cargo/git/checkouts/rstar-526d45c905f9146e/4c16830/rstar/src/rtree.rs:167:8
    |
164 | pub struct RTree<T, Params = DefaultParams>
    |            ----- required by a bound in this struct
...
167 |     T: RTreeObject,
    |        ^^^^^^^^^^^ required by this bound in `RTree`
urschrei commented 3 weeks ago

Possibly because you have to use the same dependency in the geo and geo-types Cargo.toml?

michaelkirk commented 3 weeks ago

You should be overiding using a [patch] section in your Cargo.toml - see https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html

chingiztob commented 3 weeks ago

no, 222b01a doesn't change any behavior, it just adds a failing unit test.

12836a9 changes behavior.

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

Seems like perf drops after 84d1265. With rstar = { git = "https://github.com/georust/rstar.git", branch = "new-empty-min-max" } things are back to normal