geocaml / ocaml-rtree

A pure OCaml implementation of R-Trees
BSD 3-Clause "New" or "Revised" License
26 stars 7 forks source link

Suitable for this problem? #13

Closed lindig closed 11 months ago

lindig commented 11 months ago

I am looking for a library for the following problem and it looks like it is a fit but I would value your opinion:

The problem is to do this efficiently when there are many landmarks, most of which are not relevant. Using the library I could create envelopes for each segment of the path and small square envelopes for each landmark. Now I am only interested in landmarks where the envelope of the landmark intersects with an envelope of a segment.

patricoferris commented 11 months ago

Hi @lindig, thanks for the interest in rtree!

This sounds like a good fit for Rtree. I think I would put the presumably fairly static landmarks into an index and then fold over your path. Something like

let () = Random.init 42

module Line = struct
  type t = { p0 : float * float; p1 : float * float }

  let envelope { p0 = (x1, y1); p1 = (x2, y2) } =
    let x0 = Float.min x1 x2 in
    let x1 = Float.max x1 x2 in
    let y0 = Float.min y1 y2 in
    let y1 = Float.max y1 y2 in
    Rtree.Rectangle.v ~x0 ~y0 ~x1 ~y1
end

module Path = struct
  type t = Line.t list
end

module Landmark = struct
  type t = { x : float; y : float } [@@deriving repr]

  let random ?(w = 180.) ?(h = 180.) () =
    let x = Random.float (2. *. w) -. w in
    let y = Random.float (2. *. h) -. h in
    { x; y }

  type envelope = Rtree.Rectangle.t

  let envelope p =
    let expand = 1.0 in
    let x0 = p.x -. expand in
    let x1 = p.x +. expand in
    let y0 = p.y -. expand in
    let y1 = p.y +. expand in
    Rtree.Rectangle.v ~x0 ~y0 ~x1 ~y1
end

module Landmark_index = Rtree.Make (Rtree.Rectangle) (Landmark)

let landmarks =
  List.init 1_000_000 (fun _ -> Landmark.random ())

let () =
  let path : Path.t = [ Line.{ p0 = (0., 0.); p1 = (1., 1.)}; Line.{ p0 = (1., 1.); p1 = (2., 2.)} ] in
  let idx = Landmark_index.load ~max_node_load:16 landmarks in
  let collect =
    List.fold_left (fun acc v -> Landmark_index.find idx (Line.envelope v) :: acc) [] path
    |> List.concat
  in
  Fmt.pr "Found: %a" Fmt.(list (Repr.pp Landmark.t)) collect

Of course how you choose to create envelopes is probably quite nuanced (does it take into account the topography of the place for example or how projections might impact this too).

lindig commented 11 months ago

Thanks for the detailed suggestion. Indeed, "passing a landmark" in my mind is a bit more complicated but the goal is to find candidate landmarks first - which this approach would provide. My idea for passing a landmark is to use vector algebra:

lindig commented 11 months ago

As a motivation, this is a problem Strava needs to solve: athletes are running/biking along paths that they upload. Strava has so-called segments defined between two landmarks and reports for each run/track all the segments passed with the time it took. So they need a quick way to limit the landmarks to consider.

lindig commented 11 months ago

Running the code above in utop results in a stack overflow; it does work with a lower number of landmarks, though.

top # let idx = Landmark_index.load ~max_node_load:16 landmarks  ;;
Stack overflow during evaluation (looping recursion?).
 let rec omt ~m entries =
    if List.length entries <= m then
      let leaves = List.map (fun v -> (V.envelope v, v)) entries in
      Leaf leaves
    else
      let slices = number_along_axis ~m (List.length entries) in
      let q =
        let q' = Queue.create () in
        Queue.add (entries, E.dimensions - 1) q';
        q'
      in
      let n = partitioning_task ~slices ~m q in
      Node n

This code takes the length of the landmarks twice; List.length is tail recursive so not responsible for the problem. The code needs to be reviewed for tail recursion. It includes operations on potentially large lists that are not tail recursive: List.map for example.

patricoferris commented 11 months ago

Ah thanks for the digging @lindig, very helpful, hopefully https://github.com/geocaml/ocaml-rtree/pull/14 fixes this.

lindig commented 11 months ago

Thanks for the answer; closing the issue.