opengeospatial / geoparquet

Specification for storing geospatial vector data (point, line, polygon) in Parquet
https://geoparquet.org
Apache License 2.0

Should Z bounds be included in the bbox field? #92

Closed paleolimbot closed 1 year ago

paleolimbot commented 2 years ago

In the documentation for bbox, it seems to indicate that the bbox is always just [<xmin>, <ymin>, <xmax>, <ymax>]: https://github.com/opengeospatial/geoparquet/blob/main/format-specs/geoparquet.md#bbox . The linked RFC also allows for a 3D bounding box, but the GDAL driver doesn't write the bounding box that way (see https://github.com/OSGeo/gdal/issues/5670). Is the intention to include Z in the bounding box if the Z coordinate is present?

My vote would be to not include Z in the bbox because it's not all that useful and makes it a little bit more annoying to implement.

jorisvandenbossche commented 2 years ago

I opened a PR to correct that omission last week: https://github.com/opengeospatial/geoparquet/pull/88

That PR does include Z, though. I personally don't have a strong opinion on whether we should include it or not.

paleolimbot commented 2 years ago

Ah, sorry I missed that. For now I've dropped it from the metadata since a length-4 bbox is unambiguous regardless of dimension.

jorisvandenbossche commented 2 years ago

Do other people have thoughts about including Z values in the bbox or not?

I think there are three options (in case of >2D data):

1) The bbox is always 4 values (only x, y dim), regardless of the dimension of the data
2) The bbox can be 4 or 6 values, so optionally including z values (leaving it to the data producer to decide whether it is useful)
3) Require that the bbox strictly follows the dimensionality of the data, so 4 values for 2D geometries and 6 values for 3D geometries.

Personally I agree that just having x,y bbox will typically be sufficient, so I would prefer 1 or 2.
Option 2 might become ambiguous, though, if we were to allow M values at some point, since a 6-value bbox could then describe either XYZ or XYM data.

Another complicating factor is that we currently allow mixed dimensionality (in practice you can have WKB values with a mixture of 2D and 3D geometries). So at a minimum, allowing a 4-value bbox even when the data is >2D seems best. But I am also fine with only allowing 4 values for now.
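
To make the options concrete, here is a minimal sketch of how a writer might compute the bbox. The function name `compute_bbox` and the `include_z` flag are hypothetical, not part of any GeoParquet library, and the 6-value ordering assumes the layout from the linked RFC (all minimums first, then all maximums). With `include_z=False` it behaves like option 1; with `include_z=True` it behaves like option 2, falling back to 4 values when the data mixes 2D and 3D coordinates.

```python
from typing import Iterable, List, Sequence


def compute_bbox(coords: Iterable[Sequence[float]], include_z: bool = False) -> List[float]:
    """Hypothetical helper: compute a bbox from (x, y) or (x, y, z) tuples.

    Returns [xmin, ymin, xmax, ymax], or
    [xmin, ymin, zmin, xmax, ymax, zmax] when include_z is set and
    every coordinate has a Z value.
    """
    pts = list(coords)
    if not pts:
        raise ValueError("cannot compute a bbox for zero coordinates")

    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]

    # Mixed 2D/3D input: only emit Z bounds if every coordinate has one.
    all_have_z = all(len(p) >= 3 for p in pts)
    if include_z and all_have_z:
        zs = [p[2] for p in pts]
        return [min(xs), min(ys), min(zs), max(xs), max(ys), max(zs)]

    return [min(xs), min(ys), max(xs), max(ys)]
```

Option 3 would differ only in raising an error, rather than falling back, when the bbox length does not match the dimensionality of the data.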

jorisvandenbossche commented 1 year ago

Closed by https://github.com/opengeospatial/geoparquet/pull/88 and https://github.com/opengeospatial/geoparquet/pull/145

The spec now clearly says the bbox has values for each dimension of the geometries.
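
For readers, one way to handle both lengths is to branch on the number of values. This is only a sketch: `parse_bbox` is a hypothetical helper, and it assumes the ordering from the RFC referenced by the spec, i.e. `[xmin, ymin, xmax, ymax]` for 2D and `[xmin, ymin, zmin, xmax, ymax, zmax]` for 3D.

```python
from typing import Dict, Sequence


def parse_bbox(bbox: Sequence[float]) -> Dict[str, float]:
    """Hypothetical helper: name the values of a 4- or 6-value bbox."""
    if len(bbox) == 4:
        xmin, ymin, xmax, ymax = bbox
        return {"xmin": xmin, "ymin": ymin, "xmax": xmax, "ymax": ymax}
    if len(bbox) == 6:
        # Assumed ordering: all minimums first, then all maximums.
        xmin, ymin, zmin, xmax, ymax, zmax = bbox
        return {
            "xmin": xmin, "ymin": ymin, "zmin": zmin,
            "xmax": xmax, "ymax": ymax, "zmax": zmax,
        }
    raise ValueError(f"expected 4 or 6 bbox values, got {len(bbox)}")
```

A reader that only needs 2D filtering can ignore the `zmin`/`zmax` entries when they are present.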