Closed lindig closed 11 months ago
Hi @lindig, thanks for the interest in rtree!
This sounds like a good fit for Rtree. I think I would put the presumably fairly static landmarks into an index and then fold over your path. Something like
let () = Random.init 42
module Line = struct
type t = { p0 : float * float; p1 : float * float }
let envelope { p0 = (x1, y1); p1 = (x2, y2) } =
let x0 = Float.min x1 x2 in
let x1 = Float.max x1 x2 in
let y0 = Float.min y1 y2 in
let y1 = Float.max y1 y2 in
Rtree.Rectangle.v ~x0 ~y0 ~x1 ~y1
end
module Path = struct
type t = Line.t list
end
module Landmark = struct
type t = { x : float; y : float } [@@deriving repr]
let random ?(w = 180.) ?(h = 180.) () =
let x = Random.float (2. *. w) -. w in
let y = Random.float (2. *. h) -. h in
{ x; y }
type envelope = Rtree.Rectangle.t
let envelope p =
let expand = 1.0 in
let x0 = p.x -. expand in
let x1 = p.x +. expand in
let y0 = p.y -. expand in
let y1 = p.y +. expand in
Rtree.Rectangle.v ~x0 ~y0 ~x1 ~y1
end
module Landmark_index = Rtree.Make (Rtree.Rectangle) (Landmark)
let landmarks =
List.init 1_000_000 (fun _ -> Landmark.random ())
let () =
let path : Path.t = [ Line.{ p0 = (0., 0.); p1 = (1., 1.)}; Line.{ p0 = (1., 1.); p1 = (2., 2.)} ] in
let idx = Landmark_index.load ~max_node_load:16 landmarks in
let collect =
List.fold_left (fun acc v -> Landmark_index.find idx (Line.envelope v) :: acc) [] path
|> List.concat
in
Fmt.pr "Found: %a" Fmt.(list (Repr.pp Landmark.t)) collect
Of course how you choose to create envelopes is probably quite nuanced (does it take into account the topography of the place for example or how projections might impact this too).
Thanks for the detailed suggestion. Indeed, "passing a landmark" in my mind is a bit more complicated but the goal is to find candidate landmarks first - which this approach would provide. My idea for passing a landmark is to use vector algebra:
As a motivation, this is a problem Strava needs to solve: athletes are running/biking along paths that they upload. Strava has so-called segments defined between two landmarks and reports for each run/track all the segments passed with the time it took. So they need a quick way to limit the landmarks to consider.
Running the code above in utop results in a stack overflow; it does work with a lower number of landmarks, though.
top # let idx = Landmark_index.load ~max_node_load:16 landmarks ;;
Stack overflow during evaluation (looping recursion?).
let rec omt ~m entries =
if List.length entries <= m then
let leaves = List.map (fun v -> (V.envelope v, v)) entries in
Leaf leaves
else
let slices = number_along_axis ~m (List.length entries) in
let q =
let q' = Queue.create () in
Queue.add (entries, E.dimensions - 1) q';
q'
in
let n = partitioning_task ~slices ~m q in
Node n
This code takes the length of the landmarks twice; List.length
is tail recursive so not responsible for the problem. The code needs to be reviewed for tail recursion. It includes operations on potentially large lists that are not tail recursive: List.map
for example.
Ah thanks for the digging @lindig, very helpful, hopefully https://github.com/geocaml/ocaml-rtree/pull/14 fixes this.
Thanks for the answer; closing the issue.
I am looking for a library for the following problem and it looks like it is a fit but I would value your opinion:
The problem is to do this efficiently when there are many landmarks, most of which are not relevant. Using the library I could create envelopes for each segment of the path and small square envelopes for each landmark. Now I am only interested in landmarks where the envelope of the landmark intersects with an envelope of a segment.