mapbox / vtquery

Query some gosh darn vector tiles
BSD 2-Clause "Simplified" License
44 stars 15 forks source link

Filter by geometries by bbox #55

Open springmeyer opened 6 years ago

springmeyer commented 6 years ago

I ran the slowest/most intense benchmark (query: all things - dense nine tiles) on OS X using node bench/vtquery.bench.js --iterations 5000 --concurrency 1 and a patch to disable all other tests besides query: all things - dense nine tiles. Then I profiled in Activity Monitor during the run.

What I see is:

My interpretation is that:

And therefore our overwhelming bottleneck (where > 50% of the time is taken) is mapbox::vector_tile::extract_geometry (https://github.com/mapbox/vector-tile/blob/97d8b89fe635f117ce7de25790028a65d9ce5172/include/mapbox/vector_tile.hpp#L15)

So, to reduce the latency of scenarios like this (large radius and multiple tiles) we'll need to speed up mapbox::vector_tile::extract_geometry.

Profiling output: https://callgraph.herokuapp.com/76849341d35452543e35c504964dcb94#thread-6

/cc @mapsam @flippmoke

springmeyer commented 6 years ago

Per the title, one idea to solve the mapbox::vector_tile::extract_geometry bottleneck is to avoid calling it for geometries outside the radius. To do this you could:

This would lead to the vtzero handlers getting called twice, so there would be that added cost. But my hope/hunch is that this cost would be small and worth it. Thoughts?

mapsam commented 6 years ago

Hey @springmeyer just to confirm, were these run on master or dedupe branch?

springmeyer commented 6 years ago

@mapsam - dedupe

joto commented 6 years ago

Two ideas here:

  1. To save some cost from decoding the vtzero geometry twice you can return early. Instead of calculating the bbox in the first go you could, say, detect that some points of the geometry are to the left of the query point and when you find the first point that's to the right of the query point, you can stop there and directly go to the more detailed check. Makes the code more complex though.
  2. If memory allocation is the culprit here, find a way to re-use memory. You are only looking at one feature/geometry at a time, so the same memory can be reused for every geometry. Of course this doesn't work with vector-tile geometries.
springmeyer commented 6 years ago

Instead of calculating the bbox in the first go you could, say, detect that some points of the geometry are to the left of the query point and when you find the first point that's to the right of the query point, you can stop there and directly go to the more detailed check. Makes the code more complex though.

@joto - this sounds very interesting. Could you share a little more detail on how this could work? How would we be able to do these point detections without decoding the vtzero geometry twice? Are you saying that we could do these while doing the current decode somehow (perhaps within our geometry handler)? This feels a little bit similar to https://github.com/mapbox/mapnik-vector-tile/pull/146 where, per geometry part/ring we would check if that ring was outside the bbox filter and then avoid adding it to the vector containing all rings.

springmeyer commented 6 years ago

If memory allocation is the culprit here, find a way to re-use memory. You are only looking at one feature/geometry at a time, so the same memory can be reused for every geometry. Of course this doesn't work with vector-tile geometries.

Yes, I had this same thought. Currently we don't need to carry through the geometry at all - we simply create each geometry in the loop and then discard. So, unless @mapsam anticipates needing to return the geometry to the user, we could potentially do this and save significant memory allocations. I presume we'd need to, however, templatize all the vector-tile 2.x functions to accept a custom geometry type that overrides the allocator? /cc @flippmoke - ideas on how this could be done?

mapsam commented 6 years ago

So, unless @mapsam anticipates needing to return the geometry to the user

Nope!

joto commented 6 years ago

@springmeyer To do any of this you have to rid yourself of the convenience of the vector-tile functionality and use vtzero directly. This way you have access to the undecoded geometry and can decode it "manually" stopping the moment you find a point that makes it possible that the whole geometry is "near enough" to the query point. Only then you assemble the actual geometry. For points this is all very simple and you don't need vector-tile at all I would think. For more complex geometry the nearness calculation is more difficult to do, I am not sure what the best approach would be there. Maybe https://github.com/mapbox/vector-tile/issues/36 would already help here.

artemp commented 6 years ago

Might be relevant here : https://github.com/mapbox/vtcomposite/pull/55

springmeyer commented 5 years ago

More bbox filtering applied in https://github.com/mapbox/vtcomposite/pull/91 to vcomposite