georust / geozero

Zero-Copy reading and writing of geospatial data.
Apache License 2.0
322 stars 30 forks source link

Implement screen coord translation for MVT writer #150

Closed pka closed 11 months ago

pka commented 11 months ago

Contains a breaking change in the ToMvt signatures. @nyurik Is that ok for you?

nyurik commented 11 months ago

Hi @pka, i'm ok with changing signatures, as long as we obey the versioning rules (bump the minor). Otherwise we could go the easier route and simply not change the existing method, but instead add a new as_mvt_scaled. Either is fine i guess.

That said, I am not too certain about the API itself. If we take PostGIS's MVT as a guide, it has 2 components: envelope and extent. The extent is a positive integer (same for both X and Y), and should probably be named extent. The envelope is a trickier one - technically MVT never mandates things to be in the 3857 (web mercator), but that's usually the case. Could you elaborate of your thinking with the API? Thx!

pka commented 11 months ago

Thanks for pointing to Postgis. My starting point was the Morecantile API, on which the rewrite of tile-grid as an OGC API Tiles implementation was based. OGC has a type boundingBox with lowerLeft and upperRight, but also other variants and tileWidth/tileHeight. Since MVT tiles are quadratic, I ended up with tile_size. But extent is ok too, since this is the term from the MVT spec. The proposed bounds are in geometric coordinates without tile borders like ST_AsMVTGeom. So you can directly use the calculated bounds from the tile matrix set.

I'm for changing the signature with an appropriate version bump and I'm fine with changing tile_size to extent, to match the MVT spec.

pka commented 11 months ago

What speaks against extent is, that it's often used as an other term for bounds. It's not wrong, but rather unusual for describing a tile size in pixels.

nyurik commented 11 months ago

thx, yes, i do think extent is a much better thing here simply to match the spec - creates fewer "new words" keeps things easier.

The conversion of any given geo-object (i.e. geojson) to MVT is a tricky subject because of how it is possible to do it in multiple ways. So, given a geojson, a user may want to:

So for the first case, MVTWriter should probably have the knowledge of the current tile its writing to (for the first usecase), but for the second it would simply receive geometry in the local tile coordinate space. From the optimization perspective, those coordinates should already be integers.

pka commented 11 months ago

We should stay with the minimal useful conversion function and not try to write a full ST_AsMVTGeom with generalization, clipping, etc. That can be done by a tile server engine using geozero. For translating geometries into MVT, tile size (extent) and bounds have to be known. As far as I remember, MVT geometries can be concatenated. So you could also merge to the output of multiple geometries into a single feature. The tile coordinate use case is rather exotic, but makes maybe sense for a Planetiler-like transformation pipeline. Since geozero has only a f64 coordinates API, this use case is not easy to support.

nyurik commented 11 months ago

I agree that (at least initially) geozero should focus on the MVP. The current mvp essentially has no re-projection, so it doesn't do proper conversion to a planar coordinate system, possibly resulting in an error. Your modification adds simple scaling that assumes that the initial coordinates are in the 0th tile of the planar projection, and you simply scale it to the right zoom/x/y location. I guess this is needed, but in general, esp when dealing with geojson, the input is likely to be in a 4326, so there should be an easy way to reproject as part of the conversion... which I am not too certain how to do in geozero context yet.

WRT concatenations, AFAIK, you can only concatenate MVT Layers (because they are at the top of the MVT spec). I used that trick in OpenMapTiles to generate MVTs directly from PostGIS, concatenating multiple ST_AsMVT calls together, each generating individual layer.

WRT f64 - yep, makes sense, i guess we will have to wait for the georust to support proper multi-type geotypes, including the ones with integers, and also migrate geozero to those types?

nyurik commented 11 months ago

P.S. I think the original to_mvt should be renamed to to_mvt_raw -- simply because a "tile" without any scaling has no meaning. That said, it seems other conversions like to_json do not make much sense either - simply because there is no point in converting an MVT tile to geojson without the transformation... Should we perhaps remove it?

pka commented 11 months ago

Good point regarding geographic coordinates. Projecting to planar coordinates is indeed an important step which is not integrated in geozero yet. I made some experiments with processing chains a long time ago, but they weren't successful. to_json is still useful for MVT inspection having a text format including non-geometric properties (unlike WKT).

nyurik commented 11 months ago

I did a few minor optimizations in #152 that would be merged into this PR first, and then I can merge this PR