OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org
Other
4.9k stars 2.55k forks source link

GPKG: srs_id -1 for "Undefined cartesian SRS" has meter units, should this be "undefined" units? #9580

Closed theroggy closed 6 months ago

theroggy commented 7 months ago

What is the bug?

According to the GeoPackage specs, data in an "Undefined Cartesian SRS" should be saved with srs_id = -1: https://www.geopackage.org/spec/#r11.

In GDAL this was implemented in these PR's:

However, for reading as well as writing srs_id=-1 the unit is set/expected to be "meter". So, only if the unit is meter, the srs_id in GPKG will be set to srs_id -1.

For information, in r-spatial (https://github.com/r-spatial/sf/issues/2049) the choice was made to explicitly to use a WKT that specifies the units to be undefined when saving data without crs specified to GPKG, leading to a "custom CRS" being saved in the GPKG, so an srs_id >= 100.000 instead of -1, which doesn't seem to be according to the GPKG specs?

So following questions rise:

Issue surfaced when implementing support for "Undefined Cartesian SRS" in pyogrio: https://github.com/geopandas/pyogrio/pull/368

Steps to reproduce the issue

xref: https://github.com/geopandas/pyogrio/pull/368

Versions and provenance

All versions of GDAL

Additional context

No response

CC: @edzer , @martinfleis, @jorisvandenbossche

jratike80 commented 7 months ago

Maybe you could ask a question also in here https://github.com/opengeospatial/geopackage/issues. I do not remember any discussion about the units, I think that everybody was just thinking about meters, but of course in real world data the units can be something else. On the other hand, the 2D engineering CRS that was defined for the JSON-FG, is using explicitly meters https://www.[opengis.net/def/crs/OGC/0/Engineering2D](https://www.opengis.net/def/crs/OGC/0/Engineering2D) and it would be nice if two OGC standards would have similar interpretation about the units.

rouault commented 7 months ago

It is tricky to do better than what we do currently. GDAL uses PROJ bases CRS models, which are OGC Topic 2 based, and for Unknown cartesian CRS we have opted for the rather loose EngineeringCRS representation, but an EngineeringCRS is abased on a Cartesian coordinate system, and the axis of such system must have a unit. Opting for metre is the obvious default. The issue fundamentally staims at the specification, which defines a "Unknown cartesian CRS" concept but which has no clear equivalent in OGC Topic 2.

The trick used by r-spatial to use a 'ENGCRS["Undefined Cartesian SRS with unknown unit",EDATUM["Unknown engineering datum"],CS[Cartesian,2],AXIS["x",unspecified,ORDER[1],LENGTHUNIT["unknown",0]],AXIS["y",unspecified,ORDER[2],LENGTHUNIT["unknown",0]]]' definition is kind of a hack. I'm pretty sure that Topic 2 purists would frown upon seeing a LENGTHUNIT["unknown",0] ... This is using the framework of a CRS definition, but really to express that you don't know anything about it except there are 2 axis which are orthogonal

I'm not sure there's an action on the GDAL side regarding this. You can manually you that above definition if needed. The only potential other action would be to interact with the GeoPackage working group to ask / propose that the srs_id = -1 and 0 object expand to a agreed WKT definition

Well, one thing that crossed my mind is that currently we don't have the possiblity in GeoPackage to have a spatial layer without any CRS at all (which basically all other formats allow). It is necessary mapped to a valid entry in gpkg_spatial_ref_sys. A possibility would be that GDAL registers in the gpkg_spatial_ref_sys table the following record

srs_name (String) = Undefined CRS srs_id (Integer64) = -2 organization (String) = GDAL organization_coordsys_id (Integer64) = 0 definition (String) = undefined description (String) = undefined coordinate reference system

and that it maps that to the null CRS.

jratike80 commented 7 months ago

Yes, sure. Corrected.

I do not remember who suggested to have both 0 and -1 for unknowns when the standard was written, but I think that it does make sense. If the system is known to be cartesian then it is possible to compute lengths and areas with cartesian formulas, otherwise not.

theroggy commented 7 months ago

I do not remember who suggested to have both 0 and -1 for unknowns when the standard was written, but I think that it does make sense. If the system is known to be cartesian then it is possible to compute lengths and areas with cartesian formulas, otherwise not.

This is also the reason we would like to make a change when writing GeoPandas GeoDataFrames: GeoPandas only supports doing cartesian calculations, but when writing a GeoPackage without crs the current default is srs_id 0, so geographic. A cartesian default makes a lot more sense here...

jratike80 commented 7 months ago

A cartesian default makes a lot more sense here...

At least when the coordinates to not fit within [-180,180 / 0,360] and [-90,90] ranges.

rouault commented 7 months ago

A cartesian default makes a lot more sense here...

You know, a lot of people think that a georeferenced position equals a longitude + a latitude. Only enlightened minds know about the projected CRS :-) For example the default for GeoParquet is EPSG:4326.

edzer commented 7 months ago

You know, a lot of people think that a georeferenced position equals a longitude + a latitude. Only enlightened minds know about the projected CRS :-) For example the default for GeoParquet is EPSG:4326.

Agreed, another lot of people working with spatial data do not know the difference between the two. And a lot of spatial data is non-geospatial, but coming from telescopes, medical scanners, microscopes, motion sensors, and so on - where again coordinates may be angles or distances.

theroggy commented 7 months ago

And a lot of spatial data is non-geospatial, but coming from telescopes, medical scanners, microscopes, motion sensors, and so on - where again coordinates may be angles or distances.

@edzer I'm aware of the non-geospatial domain, but not familiar with it. For the cartesian examples there... what are common units being used/relevant, or possibly there are also explicit unitless examples?

edzer commented 7 months ago

there are also explicit unitless examples?

Units can be anything, an example of (usually) unitless is hand-drawn sketchmaps.

rouault commented 7 months ago

Units can be anything, an example of (usually) unitless is hand-drawn sketchmaps.

The CRS concept (at least with the OGC Topic 2 / ISO-19111 geodesy oriented model) is probably much too elaborated and a poor fit. Reported no CRS at all would likely be a better match than a EngineeringCRS which normally involves more solid grounds to establish one

edzer commented 7 months ago

Reported no CRS at all would likely be a better match than a EngineeringCRS

Agreed, except that the interpretation is usually "just Cartesian", and not "just geodetic"; would no CRS even have to leave that open? (answered with "no" in r-spatial and geopandas)

rouault commented 7 months ago

My proposed resolution in https://github.com/OSGeo/gdal/pull/9595