OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org
Other
4.79k stars 2.51k forks source link

gdal_grid reading WKT lat lon in wrong order from VRT/CSV #2979

Open rooby opened 3 years ago

rooby commented 3 years ago

Expected behavior and actual behavior

When using gdal_grid with a CSV data source via a VRT file and using WKT I expect it to use the correct axis order.

I have data in EPSG:4326 as WKT points in a CSV file, e.g. POINT (-17.2280391687 138.9716497577), which have lat lon order.

I have created a VRT file for this CSV and am running through gdal_grid and the output file ends up being rotated 90 degrees from what I would expect. If I then edit my data source and reverse all the point values' to lon lat, e.g. POINT (138.9716497577 -17.2280391687), then the output is as I expect it to be.

As far as I know, POINT(lat long) is the correct format

I have tried with a number of variations of the SRS configuration in the VRT and also -a_srs, -txe, and -tye gdal_grid params and haven't found any settings that output what I would expect.

Steps to reproduce the problem.

  1. Put the attached files in a directory
  2. Remove the .txt from the file names (they were there for the sake of uploading to github)
  3. In the directory, run: gdal_grid -a invdist -zfield some_data -l input-data -outsize 1000 651 -ot Float32 input-data.vrt output.tiff

The output file is rotated 90 degrees from what it would be if it was reading the lat lon in the correct order. If you run this command on the input-data-flipped.vrt file it will output what I would expect to get with input-data.vrt: gdal_grid -a invdist -zfield some_data -l input-data-flipped -outsize 1000 651 -ot Float32 input-data-flipped.vrt output-flipped.tiff

Operating system

Windows 10 pro 64 bit

GDAL version and provenance

3.1.2 Also tried with 3.0.4 and 2.4.3 and it seems to have the same issue.

Example data files

input-data.vrt.txt input-data.csv.txt input-data-flipped.vrt.txt input-data-flipped.csv.txt

rooby commented 3 years ago

In a quick glance across the source files for the VRT driver I see a few usages of OAMS_TRADITIONAL_GIS_ORDER. Is this the problem? Might it be possible to explicitly specify the axis mapping strategy in the VRT file?

jratike80 commented 3 years ago

The WKT standard from 2010 OpenGIS® Implementation Standard for Geographic information - Simple feature access - Part 2: SQL option did not say anything about the coordinate order. It is possible to read the the corrigendum from year 2011 to mean that order of X and Y should change also in the WKT according to CRS:

Geometry values in R4 have points with coordinate values for x, y, z and m. The interpretation of the coordinates is subject to the coordinate reference systems associated to the point. All coordinates within a geometry object should be in the same coordinate reference systems. Each coordinate shall be unambiguously associated to a coordinate reference system either directly or through its containing geometry.

Unfortunately in the WKT presentation there is no place for telling what the CRS is. In your example you have POINT (-17.2280391687 138.9716497577) but there is no unambiguous association because only you know what the coordinates present. Well, actually you did tell it in the text so we all know it. No automatic unambiguous association then.

Traditionally WKT has been considered to be always in easting, northing or longitude, latitude order. This is how GDAL, PostGIS, JTS, and many other software and libraries deal with WKT. I am not sure about Oracle because as almost always with WKT examples, including the OGC standards, CRS is not dealt at all https://docs.oracle.com/database/121/SPATL/sdo_util-from_wktgeometry.htm#SPATL1234. I know that at least MySQL has changed this interpretation and its WKT is CRS aware nowadays. I have also seen requests that the coordinates in Well Known Binary should be flipped as well but that idea has not gained popularity. See for example https://github.com/opengeospatial/geopackage/pull/545.

rooby commented 3 years ago

I have seen it stored both ways (which is annoying).

I think my confusion here is largely a result of the move to be CRS aware, including the information at https://gdal.org/tutorials/osr_api_tut.html It seems the VRT driver does not do adhere to this.

I would expect that if the SRS is defined in the VRT then it would by default be handled using axis mapping strategy of OAMS_AUTHORITY_COMPLIANT.

On top of that, a nice to have would be that the default axis mapping strategy could be overridden in the VRT XML.

jratike80 commented 3 years ago

I am not sure how much of this issue is about VRT and how much about WKT. Perhaps it is mostly about the latter and I fear that it is impossible to handle it right so that everybody would be happy.

Currently axis order in 'POINT (10.3958 63.4269)' (without external references) is undefined. PostGIS developers created EWKT syntax 'POINT (10.3958 63.4269)',4326 that defines the CRS but axis order for PostGIS is lon-lat. For MySQL it is lat-lon, but MySQL developers also realised that even it may be principally the right order it will not work always together with existing data and when interoperability with other software is needed. Therefore they created an explicit variant of WKT ('POINT(10.3958 63.4269)', 4326, 'axis-order=long-lat') https://mysqlserverteam.com/axis-order-in-spatial-reference-systems/. That it fine but probably supported only by MySQL.

Because it is impossible to do it right we should try to do something that leads to minimum amount of errors. My suggestion is to

rouault commented 3 years ago

I could imagine a WKT_AXIS_ORDER=TRADITIONAL_GIS/AUTHORITY_COMPLIANT and WKT_CRS=AUTH:CODE or CRS WKT open options for the CSV driver to define the order of coordinates and the CRS (you would need both so that the driver can do the right thing). The fun question is if you do WKT_AXIS_ORDER=AUTHORITY_COMPLIANT and WKT_CRS=EPSG:4326, should the driver report as

rooby commented 3 years ago

@jratike80

I am not sure how much of this issue is about VRT and how much about WKT. Perhaps it is mostly about the latter and I fear that it is impossible to handle it right so that everybody would be happy.

Ah right. I'm not yet familiar enough with the workings of GDAL to know exactly where the issue lies in this case.

  • Perhaps it could be an open option "swap_WKT_axes" in the csv driver. Then it could be used also in VRT via OpenOptions.

This is the approach I have taken as the current workaround for my application. I have a flag in my process that flips the POINT() coords as required, before processing the data with GDAL. It's not the ideal solution due to extra preprocessing of potentially very large files, but it works.

rooby commented 3 years ago

@rouault That's pretty much what I was thinking.

The fun question is if you do WKT_AXIS_ORDER=AUTHORITY_COMPLIANT and WKT_CRS=EPSG:4326, should the driver report as

I would think it would be (first axis=lat, second axis=lon) and geometries in lat, lon order, and you would specify WKT_AXIS_ORDER=OAMS_TRADITIONAL_GIS_ORDER to use (first axis=lon, second axis=lat) and geometries in lon, lat order.

I've only fairly recently dug into this axis order issue however from what I've been reading it seems that the industry is generally moving to long lat order being the legacy option and lat lon being the default, so I figure GDAL 3.x would also go the same way. For example:

jratike80 commented 3 years ago

I would narrow the discussion to deal with the textual presentation of geometries as WKT but before that one more link https://github.com/opengeospatial/geopackage/pull/545 that may interest you as well.

The question is not so much about if the axis order should follow the official definition of the CRS, if the format permits. That, as you prove with your references, is generally accepted. But does WKT belong to those encodings which support swapping the order of coordinates by the CRS? I would say that perhaps, but only partially. For me is seems like the authors of the WKT standard (and WMS 1.0, perhaps even GML 2.0) were thinking "the first axis goes left to right on paper and on my computer screen, and the second one goes from bottom to top". Unfortunately the well defined examples in OpenGIS® Implementation Standard for Geographic information - Simple feature access - Part 2: SQL option are using UTM coordinates (POSC:32214) with official easting-northing order. Examples with CRS like EPSG:4326 or EPSG:2393 would have saved a lot of pain. Or writing the left-to-right and bottom to top rule into the standard.

If/because CRS and axis order are important in WKT then what is really needed is a new WKT specification that explicitly defines how to write at least CRS directly into WKT. I would prefer writing the axis order as plain text as well because WKT is often meant for humans eyes. Otherwise those poor humans must check from external sources what is the difference between these two, and if one or another is possibly wrong.

POINT (429510.98 6849630.33), 3067
POINT (429510.98 6849630.33), 5048

EDIT

Also by looking at Figure 2 in the standard the author did not have lat-long order in their mind.

kuva

ldesousa commented 2 years ago

There is a discussion at StackExchange that could be relevant for this issue. The concept of WKT having a fixed coordinate order is an assumption made by the software. In its standards the OGC applies WKT always considering coordinate order to be dependent on the associated CRS. This is specified explicitly in GeoSPARQL, for instance. GML is another example of note.

The argument that WKT has no feature to convey a CRS is a mute one in this context, imho, since programmes in the GDAL/OGR constellation require the declaration of input/ouput CRSs. Furthermore, the OGC usually defines a default CRS when one is not given (again GeoSPARQL is a good example).

Coordinate order in GDAL/OGR is an issue that goes beyond WKT, manifesting itself in other aspects of the library. Take for instance the example below with gdal_transform. Even though provided with two geodesic CRSs, the programme requests an "X" and a "Y", not angles, immediatly creating doubt. And then the result is given with a swapped coordinated order.

$ gdaltransform -s_srs EPSG:4326 -t_srs CRS:84
Enter X Y [Z [T]] values separated by space, and press Return.
10 20
10 20 0

Contrast that with the same operation conducted with Proj:

$ echo 10 20 0 | cs2cs "EPSG:4326" "http://www.opengis.net/def/crs/OGC/1.3/CRS84"
20dE  10dN 0.000

There is probably a broader reflection to be had on the behaviour of GDAL/OGR with geodesic CRSs.

jratike80 commented 2 years ago

I think that GDAL knows the axis order difference between EPSG:4326 and CRS:84 but GDAL utilities are using the "traditional GIS order" https://trac.osgeo.org/gdal/wiki/rfc73_proj6_wkt2_srsbarn. Changing the behavior of old programs would make some users happy, some others unhappy.

What axis labels the programs show users when they request input is another question, but X Y Z M are also used throughout the OGC standards like "Implementation Standard for Geographic information - Simple feature access - Part 2: SQL option".

since programmes in the GDAL/OGR constellation require the declaration of input/ouput CRSs.

Quite many GDAL programs deal well with unknown CRS

gdalcrstest.csv
===============
id,wkt
1,"POINT (1 5)"
ogrinfo gdalcrstest.csv -al
INFO: Open of `gdalcrstest.csv'
      using driver `CSV' successful.

Layer name: gdalcrstest
Geometry: Unknown (any)
Feature Count: 1
Extent: (1.000000, 5.000000) - (1.000000, 5.000000)
Layer SRS WKT:
(unknown)
id: String (0.0)
wkt: String (0.0)
OGRFeature(gdalcrstest):1
  id (String) = 1
  wkt (String) = POINT (1 5)
  POINT (1 5)

Conversion into GML yields a GML that has no srsName element at any level.

<ogr:geometryProperty><gml:Point gml:id="gdalcrstest.geom.0"><gml:pos>1 5</gml:pos></gml:Point></ogr:geometryProperty>

ldesousa commented 2 years ago

@jratike80 It is good to have user hapiness in mind. The question is why GDAL/OGR only targets users that do not use standards? It should be possible to make users working with standards happy too. I am not expecting that to happen overnight, but why can't it be a goal?

jratike80 commented 2 years ago

I like joking but on other forums than in GitHub or Internet. GDAL/OGR does have something to offer also for those who use standards: GML, WMS, WMTS, WFS, WCS, KML, GeoPackage, GeoTIFF, GML in JPEG 2000, OGC API Features, even most part of the Simple Feature Access despite GDAL does not honor automatically the CRS dependent axis order of WKT geometries.

I think that most arguments about WKT/WKT were expressed already 15 years ago https://social.msdn.microsoft.com/Forums/sqlserver/en-US/41250c42-25e6-4de7-953e-a6c41ada383f/xy-coords-reversed-in-wkt-amp-wkb-methods-in-nov-ctp?forum=sqlspatial (I did not know that MS Spatial then changed into lon-lat https://docs.microsoft.com/en-us/archive/blogs/isaac/the-upcoming-geography-coordinate-order-swap-a-faq).

You mentioned GeoSPARQL as a good example and I agree with you, because GeoSPARQL requires that if WKT is not in CRS84 it must also present the CRS
"<http://www.opengis.net/def/crs/EPSG/0/4326> Point(33.95 -83.38)"^^<http://www.opengis.net/ont/geosparql#wktLiteral>

What I would like to see as a goal is a new version of the WKT geometry standard that defines an explicit method for attaching CRS permanently into WKT, and define how the old fuzzy format POINT (1 5) or POINT (2774537 8438308) should be interpreted.

nmtoken commented 1 year ago

https://gdal.org/drivers/vector/kml.html is incorrect in stating that KML is EPSG:4326. KML standard isn't EPSG:4326, it is according to the standard document (https://docs.ogc.org/is/12-007r2/12-007r2.html) Appendix B KML Coordinate Reference System Definition (Normative) http://www.opengis.net/def/crs/OGC/0/LonLat84 so you could avoid all confusion by just dropping the reference to EPSG:4326

jratike80 commented 1 year ago

There is a Edit on GitHub link in the upper right corner of the page. I would mention that the KML system is similar to EPSG:4326 but with reversed lon/lat. Also OGC tries to give this information in "scope".

https://www.opengis.net/def/crs/OGC/0/LonLat84

<identifier codeSpace="http://www.opengeospatial.org/ogcna">http://www.opengis.net/def/crs/OGC/0/LonLat84</identifier>
<name>WGS 84 with long/lat axis order</name>
<scope>KML geographic 2D coordinate reference system (adapted from EPSG-4326).</scope>