geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License

Choose a consistent way of dealing with +2D geometries #225

Closed (theroggy closed this 1 year ago)

theroggy commented 1 year ago

Remark: the content here is being edited based on the comments below to keep a consistent overview.

Out in the wild there are three flavours of naming for geometry types with more than two dimensions ("+2D"). With a "Point" layer as an example: "2.5D POINT", "3D POINT" or "POINT Z".

In the pyogrio codebase and documentation some flavours are already in use:

Explicitly choosing one flavour to be returned by the pyogrio API (e.g. pyogrio.read_info) would probably be a good idea. That flavour would then also be the logical one to use consistently in the code base and documentation.

For input parameters it is also an option to support multiple flavours.
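To make the "support multiple flavours on input" idea concrete, here is a minimal sketch of a normalizer that accepts the flavours mentioned above and maps them to the "Point Z" style. The function name `normalize_geometry_type` and the list of base types are assumptions for illustration only; nothing here is part of the pyogrio API.

```python
# Hypothetical helper, NOT part of pyogrio: map the "2.5D X", "3D X",
# "X Z" and "XZ" flavours onto a single "X Z" spelling.

# Assumption: only these common base types are handled in this sketch.
_BASE_TYPES = (
    "Point",
    "LineString",
    "Polygon",
    "MultiPoint",
    "MultiLineString",
    "MultiPolygon",
    "GeometryCollection",
)
_BY_LOWER = {name.lower(): name for name in _BASE_TYPES}


def normalize_geometry_type(value: str) -> str:
    """Return the geometry type in "Point" / "Point Z" style."""
    # Collapse repeated whitespace and compare case-insensitively.
    lower = " ".join(value.split()).lower()
    has_z = False

    # Flavours 1 and 2: a "2.5D " or "3D " prefix.
    for prefix in ("2.5d ", "3d "):
        if lower.startswith(prefix):
            lower = lower[len(prefix):]
            has_z = True
            break

    # Flavour 3: a " Z" suffix, or a "Z" glued to a known base type.
    if lower.endswith(" z"):
        lower = lower[:-2]
        has_z = True
    elif lower.endswith("z") and lower[:-1] in _BY_LOWER:
        lower = lower[:-1]
        has_z = True

    base = _BY_LOWER.get(lower)
    if base is None:
        raise ValueError(f"unrecognized geometry type: {value!r}")
    return f"{base} Z" if has_z else base


for flavour in ("2.5D POINT", "3D POINT", "POINT Z", "POINTZ"):
    print(flavour, "->", normalize_geometry_type(flavour))
```

With a helper like this, the API could return a single flavour while still being lenient about what callers pass in.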

The issue became apparent in #223

jorisvandenbossche commented 1 year ago

At the least, I would prefer not to use the "2.5D" in user facing APIs, since that's just confusing what this is about (I don't even know why it is that way).

Slight preference to go with "Point Z" (that matches more with things like WKT reprs of geometries, so that should be familiar), but "3D Point" being used by GDAL (eg in the output of ogrinfo) is also a good reason.

theroggy commented 1 year ago

> At the least, I would prefer not to use the "2.5D" in user facing APIs, since that's just confusing what this is about (I don't even know why it is that way).

Short answer: I'm not sure either why anyone started using it and I also think it is confusing.

Longer answer: I looked around a few days ago, and you find all sorts of explanations:

> Slight preference to go with "Point Z" (that matches more with things like WKT reprs of geometries, so that should be familiar), but "3D Point" being used by GDAL (eg in the output of ogrinfo) is also a good reason.

I also think "POINT Z" is the most "readable" option, especially if you also consider the "POINT M" and "POINT ZM" variants that are also out there for some use cases.

My only worry/question was what the rest of the ecosystem is doing. But please have a look at the initial comment in this issue again, I found and added some more info... and apparently the "POINT Z" notation is the one used in the ogr2ogr documentation, so that's not a bad credential.

jorisvandenbossche commented 1 year ago

> I also think "POINT Z" is the most "readable" option, especially if you also consider the "POINT M" and "POINT ZM" variants that are also out there for some use cases.

Ah, yes: if we want to keep the option open to support M and ZM versions, then I think the Z suffix ("Point Z") is the only sensible option ("3D" is already ambiguous).
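One possible reading of that ambiguity, sketched below: with the WKT-style suffixes, both "Point Z" and "Point M" carry three ordinates per vertex, so a bare "3D" label cannot tell them apart, while the suffix can. The `ordinates` helper is hypothetical, and the suffix meanings (Z for elevation, M for measure) are assumed to follow the usual WKT convention.

```python
# Hypothetical illustration, not pyogrio code.
# Assumption: suffixes follow the WKT convention (Z = elevation, M = measure).
_SUFFIX_ORDINATES = {
    "": ("x", "y"),
    "Z": ("x", "y", "z"),
    "M": ("x", "y", "m"),
    "ZM": ("x", "y", "z", "m"),
}


def ordinates(geometry_type: str) -> tuple:
    """Return the ordinate names implied by a "Point Z"-style type name."""
    parts = geometry_type.split()
    suffix = ""
    # Only treat the last word as a suffix if a base type precedes it.
    if len(parts) > 1 and parts[-1].upper() in ("Z", "M", "ZM"):
        suffix = parts[-1].upper()
    return _SUFFIX_ORDINATES[suffix]


# Both have three ordinates, yet they describe different things.
print(ordinates("Point Z"))
print(ordinates("Point M"))
```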

jorisvandenbossche commented 1 year ago

> But please have a look at the initial comment in this issue again, I found and added some more info... and apparently the "POINT Z" notation is the one used in the ogr2ogr documentation, so that's not a bad credential.

I would say that is more than enough to go for "Point Z" style (GeoParquet spec is also using that).

(It seems that GDAL ogr2ogr allows both "POINTZ" and "POINT Z")