mapbox / tile-reduce

mapreduce vector tile processing
ISC License

Blockers for BLR on tile-reduce #88

Closed tmcw closed 7 years ago

tmcw commented 8 years ago

Our blockers for tile-reduce replacing Overpass in our workflow are:

- lack of relations
- inability to stitch-together complete features

Replacing Overpass would be a big win: it often goes down and work grinds to a halt, and we have to do a bunch of hacking to convert Overpass's JSON into GeoJSON after the fact. If we could switch, we could write faster and easier data extractors and pipe the data directly into tippecanoe.

wboykinm commented 8 years ago

@tmcw Is the difficulty of the second bullet documented anywhere or is it something you haven't tried yet?

anandthakker commented 8 years ago

@wboykinm A couple weekends ago I started playing around with the problem of stitching together tiled features that don't have an id property for joining. Found it to be a super interesting geometry problem, but I started feeling like it was going to be a lot of work for a pretty unlikely use case... maybe I was wrong, heh

mourner commented 8 years ago

Yes, stitching features is a much harder problem than it looks at first sight, even when you have the ids and if you assume special conditions like zero tile buffer and artifact-less clipping by the tile generator. In any case, this should be a separate GeoJSON-processing project that's out of scope for tile-reduce.

As for relations, the problem is that relations can't be represented as vector tiles as per the spec. However, we can make a tile-based index of relations and read them from file system using custom code inside the map function (@rclark has good experience with this). I don't think we should do anything for supporting the relations use case as a part of the tile-reduce core.
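
A minimal sketch of what such a tile-based lookup might look like, assuming a hypothetical on-disk layout of `<indexDir>/<z>/<x>/<y>.json`. The function name, the layout, and the shape of the stored relations are illustrative only; the `[x, y, z]` tile order is assumed to match what the map function receives.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical layout: <indexDir>/<z>/<x>/<y>.json holds a JSON array of
// the relations that touch that tile. Returns [] when no file exists.
function readRelationsForTile(indexDir, tile) {
  const [x, y, z] = tile; // assumed [x, y, z] order
  const file = path.join(indexDir, String(z), String(x), `${y}.json`);
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}
```

A map function could call something like this with its own tile and join the result against the tile's features, which keeps relation handling outside the tile-reduce core.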

e-n-f commented 8 years ago

Would it help to have a tippecanoe feature that would never crop features but would copy them intact (but with potentially huge size beyond the tile extent) into every tile they overlap? That would be easy to add.

tmcw commented 8 years ago

cc @aarthykc who might also be working on the stitching-features-together issue

@mourner yep, totally appreciate that potentially neither of these things will actually be part of tile-reduce core, but I'd love if we can keep the discussion here at first to spec out what those things will be.

morganherlocker commented 8 years ago

lack of relations

Relations can often be expressed through indexing their parts. For turn restrictions, we use a point feature representing the intersection, with the properties expressing the contents of the relation. We have also been experimenting with a C++ osm.pbf tile splitter that could make relation manipulation and prep more manageable.
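
As a rough illustration of the turn-restriction idea, the sketch below turns a restriction into a GeoJSON point at the intersection. The input shape and the property names (`@id`, `@type`, `from`, `to`) are assumptions made for this example, not the format actually used.

```javascript
// Hypothetical input: a turn restriction whose "via" node has already been
// resolved to coordinates. None of these field names come from the thread.
function restrictionToPoint(relation) {
  return {
    type: 'Feature',
    geometry: { type: 'Point', coordinates: relation.via.coordinates },
    properties: {
      '@id': relation.id,
      '@type': 'relation',
      restriction: relation.tags.restriction,
      from: relation.from, // id of the way the turn starts on
      to: relation.to // id of the way the turn ends on
    }
  };
}

const example = restrictionToPoint({
  id: 123,
  tags: { type: 'restriction', restriction: 'no_left_turn' },
  from: 1001,
  via: { id: 2002, coordinates: [77.5946, 12.9716] },
  to: 1003
});
```

Because the result is an ordinary point feature, it can be tiled like any other geometry and read back in a map function.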

inability to stitch-together complete features

This is tricky. It's an unbounded problem, since one feature could be very large. Most analysis I have done with polygons did not require the full feature all at once, but just the id. With split geometries and ids, you can still calculate things like:

For many of the operations where you would want the full feature, the stitching process would invalidate the operation you were trying to perform. An example of this from QA-land would be checking for self-intersections. The vector tile creation process is fuzzy enough that you cannot do this reliably (and self-intersections are not valid as of the V2 spec).

Regardless, I have found many unexpected operations that can be performed across split features. I would be interested to hear about cases that cannot be handled this way and that would still make sense from a performance perspective.
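
One way to picture an operation across split features is a per-id total: each tile reports a partial sum for the pieces it holds, and the partial sums are merged afterwards. The sketch below is a toy under stated assumptions: the `@id` property name is assumed, and `planarLength` is a simplified measure in raw coordinate units rather than a geodesic length.

```javascript
// Toy measure: planar length of a LineString in coordinate units.
function planarLength(feature) {
  const c = feature.geometry.coordinates;
  let sum = 0;
  for (let i = 1; i < c.length; i++) {
    sum += Math.hypot(c[i][0] - c[i - 1][0], c[i][1] - c[i - 1][1]);
  }
  return sum;
}

// Map step (per tile): total the measure for each feature id in the tile.
function mapTile(features, measure) {
  const partial = {};
  for (const f of features) {
    const id = f.properties['@id']; // assumed id property
    partial[id] = (partial[id] || 0) + measure(f);
  }
  return partial;
}

// Reduce step: merge one tile's partial totals into the running totals.
function mergeTotals(totals, partial) {
  for (const id of Object.keys(partial)) {
    totals[id] = (totals[id] || 0) + partial[id];
  }
  return totals;
}

// The same way (id 7) split across two tiles.
const piece = (coords) => ({
  type: 'Feature',
  properties: { '@id': 7 },
  geometry: { type: 'LineString', coordinates: coords }
});
const tileA = [piece([[0, 0], [3, 4]])];
const tileB = [piece([[3, 4], [6, 8]])];

const totals = {};
mergeTotals(totals, mapTile(tileA, planarLength));
mergeTotals(totals, mapTile(tileB, planarLength));
```

This only adds up cleanly when the pieces do not overlap; with a nonzero tile buffer, the segments inside the buffer would be counted twice.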

rclark commented 8 years ago

lack of relations

This is a two-pronged problem:

morganherlocker commented 8 years ago

Closing, since I believe the two main issues here have been addressed. In summary:

lack of relations

Relations are not spatial, so join them to some geometry first (most relations have members that are points, lines, or polygons, and you can combine whatever is present into a GeometryCollection). A similar spatial joining process already happens with ways, so relations are not actually particularly unique.
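
A minimal sketch of that joining step, assuming the relation's members have already been resolved to GeoJSON geometries. The input shape and the `@id` / `@type` property names are hypothetical.

```javascript
// Hypothetical input: relation.members is an array of { role, geometry },
// where geometry is a GeoJSON geometry or null (e.g. an unresolved member).
function relationToFeature(relation) {
  const geometries = relation.members
    .map((m) => m.geometry)
    .filter((g) => g != null); // combine whatever is present
  return {
    type: 'Feature',
    geometry: { type: 'GeometryCollection', geometries },
    properties: Object.assign(
      { '@id': relation.id, '@type': 'relation' },
      relation.tags
    )
  };
}

const route = relationToFeature({
  id: 42,
  tags: { type: 'route', route: 'bus' },
  members: [
    { role: 'stop', geometry: { type: 'Point', coordinates: [0, 0] } },
    { role: '', geometry: { type: 'LineString', coordinates: [[0, 0], [1, 1]] } },
    { role: 'subrelation', geometry: null }
  ]
});
```

The resulting feature has a spatial footprint, so it can be tiled alongside nodes and ways.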

inability to stitch-together complete features

tippecanoe's new -pc / --no-clipping flag preserves full geometries and can be used for this. That said, using TileReduce in this way is kind of an anti-pattern, since you could end up parsing features of unbounded size with a bunch of duplicated work. "Unclipping" through stitching is probably not a good avenue, since clipping is lossy and stitching is difficult/expensive. Most problems can still be solved with clipped geometry.
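
To make the duplicated-work point concrete: with unclipped tiles, the same feature can show up in every tile it overlaps, so anything that counts or collects features would need to de-duplicate by id. The sketch below assumes an `@id` property on each feature; the helper name is made up for this example.

```javascript
// Returns a function that keeps only features whose id has not been seen
// in any earlier call. State lives in the closure.
function makeDeduper() {
  const seen = new Set();
  return function takeNew(features) {
    const fresh = [];
    for (const f of features) {
      const id = f.properties['@id']; // assumed id property
      if (!seen.has(id)) {
        seen.add(id);
        fresh.push(f);
      }
    }
    return fresh;
  };
}

const feature = (id) => ({ type: 'Feature', properties: { '@id': id }, geometry: null });
const takeNew = makeDeduper();
const fromTileA = takeNew([feature(1), feature(2)]);
const fromTileB = takeNew([feature(2), feature(3)]); // 2 was already seen
```

Since tiles are processed independently of one another, shared state like this would presumably have to live on the reduce side rather than in the map function. Every tile still has to parse its full copy of each feature before the duplicate can be dropped.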

wboykinm commented 8 years ago

@anandthakker @morganherlocker As a parting note I'll say it'd be super-dope to have an example of the --no-clipping approach in use lying around somewhere.

tmcw commented 8 years ago

Let's keep this open until we hear that the issues are actually addressed from the BLR side; the theoretical solvability of relations does not mean that they can be solved in practice, and until osm-qa-tiles uses no-clipping, it is not relevant to their work.

e-n-f commented 8 years ago

I don't think --no-clipping can be the general solution for osm-qa-tiles because OSM contains huge polygons that would be duplicated thousands of times. It is only going to be useful for small objects like buildings and maybe roads.
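
A rough way to estimate that duplication is to count the tiles a feature's bounding box touches, using the standard Web Mercator tile indexing formula. The function names are illustrative, and the zoom level in the usage note is chosen only as an example.

```javascript
// Standard Web Mercator ("slippy map") tile index for a lon/lat at a zoom.
function lonLatToTile(lon, lat, zoom) {
  const n = Math.pow(2, zoom);
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  return [x, y];
}

// Number of tiles a [west, south, east, north] bbox touches at a zoom.
// An unclipped feature would be copied into at most this many tiles.
function tilesCovered(bbox, zoom) {
  const [west, south, east, north] = bbox;
  const [xMin, yMin] = lonLatToTile(west, north, zoom); // top-left tile
  const [xMax, yMax] = lonLatToTile(east, south, zoom); // bottom-right tile
  return (xMax - xMin + 1) * (yMax - yMin + 1);
}
```

For instance, `tilesCovered([-1, -1, 1, 1], 12)` gives 576 for a box two degrees on a side at the equator, while a building-sized box usually lands in a single tile. This is an upper bound, since a feature need not touch every tile inside its bounding box.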

mourner commented 7 years ago

Closing as stale and not actionable on the tile-reduce side. If the concerns are still not addressed, let's open corresponding tickets in our internal data team repo.