r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.31k stars 292 forks source link

Handling local CRS, st_can_transform() #2049

Closed edzer closed 8 months ago

edzer commented 1 year ago

This came up here: https://github.com/dieghernan/tidyterra/issues/64 ; Cc @rhijmans

Since GPKG (for good reasons!), and I believe also geoparquet, need to know on writing whether coordinates are ellipsoidal or Cartesian, we do write for objects with NA crs the wkt LOCAL_CS["Undefined Cartesian SRS"]. This is returned as such on reading:

library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
st_sf(a=1,geom=st_sfc(st_point(0:1))) |> write_sf("x.gpkg")Undefined Cartesian SRS
# writing GPKG: substituting LOCAL_CS["Undefined Cartesian SRS"] for missing CRS
(y = read_sf("x.gpkg"))
# Simple feature collection with 1 feature and 1 field
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 0 ymin: 1 xmax: 0 ymax: 1
# Projected CRS: Undefined Cartesian SRS
# # A tibble: 1 × 2
#       a        geom
#   <dbl> <POINT [m]>
# 1     1       (0 1)
is.na(st_crs(y))
# [1] FALSE

(if you write nothing, GDAL will assume and write the default Undefined geographic SRS).

In #1776 the suggestion came up to use Undefined Cartesian SRS instead of NA (as this is how sf and sp always handled it: NA of course doesn't reveal whether coordinates are geographic or Cartesian, but following a long GIS legacy we handled such data as Cartesian for plotting and area and distance computation, units being implicitly those of the coordinates). Using that definition has the problem that

st_crs('LOCAL_CS["Cartesian (Meter)",LOCAL_DATUM["Local Datum",0],UNIT["Meter",1.0],AXIS["X",EAST],AXIS["Y",NORTH]]') == st_crs('LOCAL_CS["Undefined Cartesian SRS"]')
# FALSE

which is obvious: the first has units and axis directions, but this way a that potentially a large number of CRS definitions may come up as valid WKT, be essentially identical but test as different.

So far we've essentially handled two cases: CRS missing, or CRS given and compatible with any other Earth related CRS. Now new cases come up.

The Undefined Cartesian SRS breaks plotting with ggplot2, as that checks for is.na(crs), which is not TRUE, and then assumes the crs can be transformed (a graticule is drawn in EPSG:4326 and transformed to the data CRS, but this fails on st_transform()). Following @rhijmans suggestion, I added a function st_can_transform() that checks whether two CRS can be transformed into each other; this is now in sf branch can_transform

The question is how to go forward: we need to do one of the following:

I'm also thinking about transformations on datums of other celestial bodies; PROJ seems to be able to do this; we're r-spatial, not r-geo; st_graticule may have to be generalized in this respect. I think we're heading for the second option.

rsbivand commented 1 year ago

With sf main I see:

> library(sf)
Linking to GEOS 3.11.1, GDAL 3.6.0, PROJ 9.1.1; sf_use_s2() is TRUE
> tf <- tempfile(fileext=".gpkg")
> st_sf(a=1,geom=st_sfc(st_point(0:1))) |> st_write(tf)
writing GPKG: substituting LOCAL_CS["Undefined Cartesian SRS"] for missing CRS
Writing layer `file132baa52ef1fee' to data source 
  `/tmp/RtmpXvMZLr/file132baa52ef1fee.gpkg' using driver `GPKG'
Writing 1 features with 1 fields and geometry type Point.
> yy <- st_read(tf)
Reading layer `file132baa52ef1fee' from data source 
  `/tmp/RtmpXvMZLr/file132baa52ef1fee.gpkg' using driver `GPKG'
substituting CRS NA for "Undefined Cartesian SRS"
Simple feature collection with 1 feature and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 0 ymin: 1 xmax: 0 ymax: 1
CRS:           NA
> is.na(st_crs(yy))
[1] TRUE
> terra::vect(tf)
 class       : SpatVector 
 geometry    : points 
 dimensions  : 1, 1  (geometries, attributes)
 extent      : 0, 0, 1, 1  (xmin, xmax, ymin, ymax)
 source      : file132baa52ef1fee.gpkg
 coord. ref. : Undefined Cartesian SRS 
 names       :     a
 type        : <num>
 values      :     1
 sp::proj4string(rgdal::readOGR(tf))
OGR data source with driver: GPKG 
Source: "/tmp/RtmpXvMZLr/file132baa52ef1fee.gpkg", layer: "file132baa52ef1fee"
with 1 features
It has 1 fields
[1] NA
Warning messages:
1: OGR support is provided by the sf and terra packages among others 
2: OGR support is provided by the sf and terra packages among others 
3: OGR support is provided by the sf and terra packages among others 
4: OGR support is provided by the sf and terra packages among others 
5: OGR support is provided by the sf and terra packages among others 
6: OGR support is provided by the sf and terra packages among others 
7: OGR support is provided by the sf and terra packages among others

In rgdal, the WKT2 string is returned, but because exportToProj4() fails, the CRS is known to be NA_STRING.

So while current sf HEAD on main and sp/rgdal do this right, terra has no mechanism in place to indicate that a WKT2 LOCAL_CS is actually NA planar. I assume that the code/output above is from the can_transform branch. I don't think other planets are important.

I'm not planning to make any changes to rgdal, but if can_transform is merged, we will need to review sf::crs/sp::CRS coercion and probably any subsequent fall-out for spTransform().

Basically, this provides additional support for my strong view that ggplot should never have imposed graticules, as nobody I know implicitly knows the geographical coordinates for places on interest on any map; they just are never used for orientation as a grid might be on an xy plot.

rsbivand commented 1 year ago

Curiously:

> library(RSQLite)
> db <- DBI::dbConnect(RSQLite::SQLite(), tf)
> DBI::dbListTables(db)
 [1] "file132baa52ef1fee"                  
 [2] "gpkg_contents"                       
 [3] "gpkg_extensions"                     
 [4] "gpkg_geometry_columns"               
 [5] "gpkg_ogr_contents"                   
 [6] "gpkg_spatial_ref_sys"                
 [7] "gpkg_tile_matrix"                    
 [8] "gpkg_tile_matrix_set"                
 [9] "rtree_file132baa52ef1fee_geom"       
[10] "rtree_file132baa52ef1fee_geom_node"  
[11] "rtree_file132baa52ef1fee_geom_parent"
[12] "rtree_file132baa52ef1fee_geom_rowid" 
[13] "sqlite_sequence"                     
> DBI::dbReadTable(db, "gpkg_spatial_ref_sys")
                  srs_name srs_id organization organization_coordsys_id
1  Undefined Cartesian SRS     -1         NONE                       -1
2 Undefined geographic SRS      0         NONE                        0
3          WGS 84 geodetic   4326         EPSG                     4326
                                                                                                                                                                                                                                                                                                      definition
1                                                                                                                                                                                                                                                                                                      undefined
2                                                                                                                                                                                                                                                                                                      undefined
3 GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
                                                               description
1                          undefined Cartesian coordinate reference system
2                         undefined geographic coordinate reference system
3 longitude/latitude coordinates in decimal degrees on the WGS 84 spheroid
> DBI::dbReadTable(db, "gpkg_geometry_columns")
          table_name column_name geometry_type_name srs_id z m
1 file132baa52ef1fee        geom              POINT     -1 0 0
edzer commented 1 year ago

substituting CRS NA for "Undefined Cartesian SRS"

I was planning to revert (undo) https://github.com/r-spatial/sf/commit/d016301c18e4ce4c284522f347478bf25fe84075, which is responsible for that.

Basically, this provides additional support for my strong view that ggplot should never have imposed graticules, as nobody I know implicitly knows the geographical coordinates for places on interest on any map; they just are never used for orientation as a grid might be on an xy plot.

That would be another issue. I think they are easier to understand than grids in projected coordinates, and their shape may help understanding what kind of projection took place.

rsbivand commented 1 year ago

Yes, I understand.

Might an RFC make sense? This reaches packages like tmap too, and mapview to transform to Web Mercator.

I'll actually need to modify rgdal too, the geometry is associated with srs_id 0; this may also be the crs/CRS coercion method:

> x <- st_sf(a=1,geom=st_sfc(st_point(0:1)))
> tf1 <- tempfile(fileext=".gpkg")
> tf1
[1] "/tmp/RtmpXvMZLr/file132baa6d4fe223.gpkg"
> rgdal::writeOGR(as(x, "Spatial"), tf1, , layer="file132baa6d4fe223", driver="GPKG")
Warning messages:
1: OGR support is provided by the sf and terra packages among others 
2: OGR support is provided by the sf and terra packages among others 
> db <- DBI::dbConnect(RSQLite::SQLite(), tf1)
> DBI::dbReadTable(db, "gpkg_spatial_ref_sys")
                  srs_name srs_id organization organization_coordsys_id
1  Undefined Cartesian SRS     -1         NONE                       -1
2 Undefined geographic SRS      0         NONE                        0
3          WGS 84 geodetic   4326         EPSG                     4326
                                                                                                                                                                                                                                                                                                      definition
1                                                                                                                                                                                                                                                                                                      undefined
2                                                                                                                                                                                                                                                                                                      undefined
3 GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
                                                               description
1                          undefined Cartesian coordinate reference system
2                         undefined geographic coordinate reference system
3 longitude/latitude coordinates in decimal degrees on the WGS 84 spheroid
> DBI::dbReadTable(db, "gpkg_geometry_columns")
          table_name column_name geometry_type_name srs_id z m
1 file132baa6d4fe223        geom              POINT      0 0 0

Further:

> st_crs('LOCAL_CS["Undefined Cartesian SRS"]')
Coordinate Reference System:
  User input: LOCAL_CS["Undefined Cartesian SRS"] 
  wkt:
ENGCRS["Undefined Cartesian SRS",
    EDATUM[""],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]
> sp::CRS('LOCAL_CS["Undefined Cartesian SRS"]')
Error in sp::CRS("LOCAL_CS[\"Undefined Cartesian SRS\"]") : NA
> as(st_crs('LOCAL_CS["Undefined Cartesian SRS"]'), "CRS")
Coordinate Reference System:
Deprecated Proj.4 representation: NA 
WKT2 2019 representation:
ENGCRS["Undefined Cartesian SRS",
    EDATUM[""],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]] 

My argument with graticules in ggplot is that they may be added for the purpose you describe (world/continental scales) but should not be required as the default setting for maps. Adding them as a non-default decoration is fine - the issue is that thematic cartography in general avoids that kind of unnecessary ink, with the exception of conveying the message that projection distortion is or may be in play. That is a small minority of use cases, I feel.

rsbivand commented 1 year ago
> tf1a <- tempfile(fileext=".gpkg")
> xa <- as(x, "Spatial")
> slot(xa, "proj4string")
Coordinate Reference System:
Deprecated Proj.4 representation: NA 
WKT2 2019 representation:
ENGCRS["Undefined Cartesian SRS",
    EDATUM[""],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]] 
> tf1a
[1] "/tmp/RtmpXvMZLr/file132baa778cba84.gpkg"
> rgdal::writeOGR(xa, tf1a, layer="file132baa778cba84", driver="GPKG")
Warning messages:
1: OGR support is provided by the sf and terra packages among others 
2: OGR support is provided by the sf and terra packages among others 
> db <- DBI::dbConnect(RSQLite::SQLite(), tf1a)
> DBI::dbReadTable(db, "gpkg_spatial_ref_sys")
                  srs_name srs_id organization organization_coordsys_id
1  Undefined Cartesian SRS     -1         NONE                       -1
2 Undefined geographic SRS      0         NONE                        0
3          WGS 84 geodetic   4326         EPSG                     4326
                                                                                                                                                                                                                                                                                                      definition
1                                                                                                                                                                                                                                                                                                      undefined
2                                                                                                                                                                                                                                                                                                      undefined
3 GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
                                                               description
1                          undefined Cartesian coordinate reference system
2                         undefined geographic coordinate reference system
3 longitude/latitude coordinates in decimal degrees on the WGS 84 spheroid
> DBI::dbReadTable(db, "gpkg_geometry_columns")
          table_name column_name geometry_type_name srs_id z m
1 file132baa778cba84        geom              POINT     -1 0 0
> DBI::dbDisconnect(db)
rsbivand commented 1 year ago

With planar_coords.csv and geog_coords.csv from https://gdal.org/drivers/vector/csv.html#vector-csv and using ogr2ogr because all the affected R packages use the same GDAL drivers:

library(sf)
tf0 <- tempfile(fileext=".gpkg")
plf <- "planar_coords.csv" # 1000 added to Y coordinates
gdal_utils("vectortranslate", plf, tf0, options=c("-oo", "X_POSSIBLE_NAMES=Lon*", "-oo", "Y_POSSIBLE_NAMES=Lat*", "-oo", "KEEP_GEOM_COLUMNS=NO"))
st_crs(st_read(tf0))
tf1 <- tempfile(fileext=".gpkg")
gdal_utils("vectortranslate", plf, tf1, options=c("-a_srs", 'LOCAL_CS["Undefined Cartesian SRS"]', "-oo", "X_POSSIBLE_NAMES=Lon*", "-oo", "Y_POSSIBLE_NAMES=Lat*", "-oo", "KEEP_GEOM_COLUMNS=NO"))
st_crs(st_read(tf1))

The GPKG driver inserts "Undefined geographic SRS" if an SRS is not declared, even when the coordinates would be invalid.

ggf <- "geog_coords.csv"
tf2 <- tempfile(fileext=".gpkg")
gdal_utils("vectortranslate", ggf, tf2, options=c("-oo", "X_POSSIBLE_NAMES=Lon*", "-oo", "Y_POSSIBLE_NAMES=Lat*", "-oo", "KEEP_GEOM_COLUMNS=NO"))
st_crs(st_read(tf2))
tf3 <- tempfile(fileext=".gpkg")
gdal_utils("vectortranslate", ggf, tf3, options=c("-a_srs", 'LOCAL_CS["Undefined Cartesian SRS"]', "-oo", "X_POSSIBLE_NAMES=Lon*", "-oo", "Y_POSSIBLE_NAMES=Lat*", "-oo", "KEEP_GEOM_COLUMNS=NO"))
st_crs(st_read(tf3))

So to keep missing SRS as planar, 'LOCAL_CS["Undefined Cartesian SRS"]' has to be inserted. A further problem is that a missing SRS will be written with GPKG as "Undefined geographic SRS" and will pass st_can_transform() with ballpark accuracy:

st_can_transform(st_read(tf0), "EPSG:28892") # TRUE
sf_proj_pipelines(st_crs(st_read(tf0)), "EPSG:28992")$accuracy # NA
st_can_transform(st_read(tf2), "EPSG:28992") # TRUE
sf_proj_pipelines(st_crs(st_read(tf2)), "EPSG:28992")$accuracy # NA

If a generic planar is imposed:

st_can_transform(st_read(tf1), "EPSG:28992") # FALSE
sf_proj_pipelines(st_crs(st_read(tf1)), "EPSG:28992")$accuracy # NULL
st_can_transform(st_read(tf3), "EPSG:28992") # FALSE
sf_proj_pipelines(st_crs(st_read(tf3)), "EPSG:28992")$accuracy # NULL
rsbivand commented 1 year ago

Note also https://gdal.org/drivers/vector/gpkg.html#coordinate-reference-systems, where 0 seems to be the default if nothing is given, but may be set planar.

rsbivand commented 1 year ago

And:

st_is_longlat(st_read(tf2)) # TRUE
st_is_longlat(st_read(tf0)) # TRUE
# Warning message: In st_is_longlat(st_read(tf0)) : bounding box has potentially an invalid value range for longlat data
terra::is.lonlat(terra::vect(tf2)) # TRUE
terra::is.lonlat(terra::vect(tf0)) # TRUE

For s2:

st_as_s2(st_read(tf0))
# <geodesic s2_geography[3] with CRS=OGC:CRS84>
# Warning message: In st_is_longlat(x) : bounding box has potentially an invalid value range for longlat data
st_as_s2(st_read(tf2))
# <geodesic s2_geography[3] with CRS=OGC:CRS84>
rsbivand commented 1 year ago

Next step: what we need is not:

ENGCRS["Undefined Cartesian SRS",
    EDATUM[""],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["Meter",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["Meter",1]]]

but we certainly need to avoid asserting the units as metres, for example by: ENGCRS_no_units.zip

readLines("ENGCRS_no_units.json") |> paste(collapse="\n") |> st_crs()
> readLines("ENGCRS_no_units.json") |> paste(collapse="\n") |> st_crs()
Coordinate Reference System:
  User input: {
  "$schema": "https://proj.org/schemas/v0.5/projjson.schema.json",
  "type": "EngineeringCRS",
  "name": "Undefined Cartesian SRS (unknown units)",
  "datum": {
    "name": ""
  },
  "coordinate_system": {
    "subtype": "Cartesian",
    "axis": [
      {
        "name": "Easting",
        "abbreviation": "E",
        "direction": "east"
      },
      {
        "name": "Northing",
        "abbreviation": "N",
        "direction": "north"
      }
    ]
  }
} 
  wkt:
ENGCRS["Undefined Cartesian SRS (unknown units)",
    EDATUM[""],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1]],
        AXIS["(N)",north,
            ORDER[2]]]

However:

readLines("ENGCRS_no_units.json") |> paste(collapse="\n") |> st_crs() -> ENGCRS
as(st_crs(ENGCRS), "CRS")
# OGR: Corrupt data Error in CPL_crs_parameters(x) : OGR error
> st_crs(ENGCRS)$wkt
[1] "ENGCRS[\"Undefined Cartesian SRS (unknown units)\",\n    EDATUM[\"\"],\n    CS[Cartesian,2],\n        AXIS[\"(E)\",east,\n            ORDER[1]],\n        AXIS[\"(N)\",north,\n            ORDER[2]]]"
> st_crs(ENGCRS)$proj4
OGR: Corrupt data
Error in CPL_crs_parameters(x) : OGR error

And:

> xx <- st_read(tf1)
Reading layer `planar_coords' from data source 
  `/tmp/RtmpwmurtS/file1566855611a05f.gpkg' using driver `GPKG'
Simple feature collection with 3 features and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 0.25 ymin: 1047.5 xmax: 1.1 ymax: 1049.2
Projected CRS: Undefined Cartesian SRS
> st_crs(xx) <- ENGCRS
OGR: Corrupt data
Warning messages:
1: In CPL_crs_equivalent(e1, e2) : GDAL Error 1: buildCS: missing UNIT
2: st_crs<- : replacing crs does not reproject data; use st_transform for that 
> xx
Simple feature collection with 3 features and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 0.25 ymin: 1047.5 xmax: 1.1 ymax: 1049.2
OGR: Corrupt data
Error in CPL_crs_parameters(x) : OGR error

So we are obliged to accept the probably wrong metre unit.

rsbivand commented 1 year ago

Have posted:

https://lists.osgeo.org/pipermail/proj/2022-December/010840.html

rsbivand commented 1 year ago

Interesting and constructive response from @rouault : https://lists.osgeo.org/pipermail/proj/2022-December/010842.html with

ENGCRS["Undefined Cartesian SRS with unknown unit",EDATUM[""],CS[Cartesian,2],AXIS["(E)",east,ORDER[1],LENGTHUNIT["unknown",0]],AXIS["(N)",north,ORDER[2],LENGTHUNIT["unknown",0]]]
{
  "$schema": "https://proj.org/schemas/v0.5/projjson.schema.json",
  "type": "EngineeringCRS",
  "name": "Undefined Cartesian SRS with unknown unit",
  "datum": {
    "name": ""
  },
  "coordinate_system": {
    "subtype": "Cartesian",
    "axis": [
      {
        "name": "Easting",
        "abbreviation": "E",
        "direction": "east",
        "unit": {
          "type": "LinearUnit",
          "name": "unknown",
          "conversion_factor": 0
        }
      },
      {
        "name": "Northing",
        "abbreviation": "N",
        "direction": "north",
        "unit": {
          "type": "LinearUnit",
          "name": "unknown",
          "conversion_factor": 0
        }
      }
    ]
  }
}

Because the unit now gets a type, name and conversion factor, perhaps this is a way forward to let us migrate to GPKG where SRS are planar and units are unknown?

rsbivand commented 1 year ago

See also https://mastodon.social/@edzer/109438013774699729

rouault commented 1 year ago

if you really intend to use conversion_factor=0, I'd suggest you add a test case in PROJ test suite to check that we can parse such CRS (in test/unit/test_io.cpp). Otherwise, it might be possible that someone in a few years forgets that use case, and facing an issue related to the ones I mentioned in https://lists.osgeo.org/pipermail/proj/2022-December/010842.html, just decides to reject such CRS on import.

rsbivand commented 1 year ago

The tests seem to run OK with: https://github.com/rsbivand/proj.4/commit/4ace26fbe4f8357dcedb336bf56c232d894c72fb . Should I specifically test the unit name and conversion factor (added in https://github.com/rsbivand/proj.4/commit/ab825221823929253b4c17b111734473009bfc5d)? @rouault, may I wait until I've checked a little more (@edzer ?) with R packages before going to a PR? Should we use EDATUM["Engineering datum"], or is EDATUM[""] equivalent?

rouault commented 1 year ago

Should I specifically test the unit name and conversion factor?

the conversion factor.

may I wait until I've checked a little more

sure, I meant if you intend that to be supported long term.

Should we use EDATUM["Engineering datum"], or is EDATUM[""] equivalent?

Reviewing the WKT CRS spec, I believe the empty string is not technically legal. There should be at least one character in a quote string acording to the BNF grammar. I'm going to fix PROJ to expand LOCAL_CS["foo"] as ENGCRS["foo",EDATUM["Unknown datum"],etc.

rsbivand commented 1 year ago

Conversion factor included as per https://github.com/rsbivand/proj.4/commit/ab825221823929253b4c17b111734473009bfc5d. I've rebuilt and run ctest without anything apparently breaking - does this set of tests get run in Test #48: testprojinfo?

So "Engineering datum" in EDATUM[] would be explicit, but "Unknown datum" would be a clearer description, is that your meaning?

rsbivand commented 1 year ago

"Unknown datum" name test added in https://github.com/rsbivand/proj.4/commit/8e7aa36bb22c603427d74f964fcd8e79ba62006d.

rouault commented 1 year ago

I'm going to fix PROJ to expand LOCAL_CS["foo"] as ENGCRS["foo",EDATUM["Unknown datum"],etc.

submitted in https://github.com/OSGeo/PROJ/pull/3491

rouault commented 1 year ago

I've rebuilt and run ctest without anything apparently breaking - does this set of tests get run in Test #48: testprojinfo?

no, in "Test #54: proj_test_cpp_api"

rsbivand commented 1 year ago

So:

      Start 54: proj_test_cpp_api
54/60 Test #54: proj_test_cpp_api ................   Passed   15.12 sec

indicates passing with my addition, now with "Unknown datum" in too.

rouault commented 1 year ago

The east & north directions might be even presumptuous for a unknown CRS. Perhaps using the "unspecified" direction would be better

ENGCRS["foo",
    EDATUM["Unknown engineering datum"],
    CS[Cartesian,2],
        AXIS["X",unspecified,
            ORDER[1],
            LENGTHUNIT["unknown",0]],
        AXIS["Y",unspecified,
            ORDER[2],
            LENGTHUNIT["unknown",0]]]

... But ... I'm not totally sure that unspecified is legal for a Cartesian CS in a engineering CRS. http://docs.opengeospatial.org/is/18-010r7/18-010r7.html#48 mentions """"f) In engineering CRSs the horizontal directions are only approximate, the set of directions indicating whether the coordinate system is left-handed or right-handed. (In the 2D case, the handedness is when viewed from above the plane of the system). For engineering CRSs with polar coordinate systems the direction of the rotational axis shall be 'clockwise' or 'counterClockwise'. The specified direction from which the rotation is measured shall be given through the supplementary object ; the bearing value shall be given in the unit defined through .""" So perhaps "unspecified" is not good as it doesn't allow to distinguish a lef-handed from a righ-handed CS.

"unspecified" would be clearly illegal for a projected CRS: """c) For projected CRSs, except for coordinate systems centred on a pole, the horizontal axis direction shall be 'north' or 'south' and 'east' or 'west'."""

rsbivand commented 1 year ago

Yes, I'd thought of "X" and "Y" too. I would not think that describing "X" as "horizontal" and "Y" as "vertical" as impossible (in 2D Cartesian), are they permitted tokens? Could "vertical" be taken as implying height?

rouault commented 1 year ago

the "X" or "Y" strings are axis names, governed by http://docs.opengeospatial.org/is/18-010r7/18-010r7.html#47, which are governed by a few conventions described in that paragraph. On the contrary "unspecified" is one of the members of the enumerated list. in the context of WKT CRS / ISO 19111, "vertical" implies indeed some sort of height.

rsbivand commented 1 year ago

Can I test for the unspecified token in each axis? EXPECT_EQ(axis0->direction(), AxisDirection::UNSPECIFIED); seems to work? Pushed as https://github.com/rsbivand/proj.4/commit/07a70d3f2397a627b4ae6855afb35cf6e47f9b49, passing for me.

rsbivand commented 1 year ago

Linking to https://mastodon.social/@edzer/109438013774699729 and subsequent; no reverse dependency failures that can be attributed to using:

ENGCRS["Undefined Cartesian SRS with unknown unit",
    EDATUM["Unknown engineering datum"],
    CS[Cartesian,2],
        AXIS["X",unspecified,
            ORDER[1],
            LENGTHUNIT["unknown",0]],
        AXIS["Y",unspecified,
            ORDER[2],
            LENGTHUNIT["unknown",0]]]

for "most" sf reverse dependencies for the "GPKG" vector driver. Next is to see if any mayhem appears if the unknown planar CRS is used for any write driver (my rig does not test the DB drivers like postgis). sf itself fails on test_tm.R:12 and test_tm.R:29, because instead of NA, the CRS is now "Undefined Cartesian SRS with unknown unit":

> readLines(gsub("\\.shp", "\\.prj", shp))
[1] "LOCAL_CS[\"Undefined Cartesian SRS with unknown unit\",LOCAL_DATUM[\"Unknown engineering datum\",32767],UNIT[\"unknown\",0.0],AXIS[\"X\",OTHER],AXIS[\"Y\",OTHER]]"
rsbivand commented 1 year ago

No extra failures resulting from using "Undefined Cartesian SRS with unknown unit" for all sf::st_write() and sf::write_sf() in the "most" sf reverse dependencies. I'll try spData next @Nowosad .

rsbivand commented 1 year ago

@rouault would you like me to create a PR including the new test?

rouault commented 1 year ago

would you like me to create a PR including the new test?

yes, please do.

rsbivand commented 1 year ago

Maybe it would be better to lift https://github.com/rsbivand/proj.4/blob/07a70d3f2397a627b4ae6855afb35cf6e47f9b49/test/unit/test_io.cpp#L5268-L5299 directly, as on starting a PR I see that my fork has redundant copies of two other files which should not be proposed for merging. I do not know how to tell github to only choose changes to a single file, dropping the two others.

rouault commented 1 year ago

I do not know how to tell github to only choose changes to a single file, dropping the two others.

something along

git checkout master
git pull origin master
git checkout -b unknown_unit
git cherry-pick 4ace26fbe4f8357dcedb336bf56c232d894c72fb ab825221823929253b4c17b111734473009bfc5d 8e7aa36bb22c603427d74f964fcd8e79ba62006d 07a70d3f2397a627b4ae6855afb35cf6e47f9b49
git push {your_remote} unknown_unit
rsbivand commented 1 year ago

Sorry, that didn't work. I'm not sure which order the pushes should be cherry-picked. I'd like to avoid having to re-fork if possible to get a clean repo from which to re-start.

rouault commented 1 year ago

I'd like to avoid having to re-fork if possible to get a clean repo from which to re-start.

you shouldn't need to re-fork with the correct git invocations (sometimes a git reset --hard origin/master is required to resync local master branch with upstream one, if you've messed up with your local master branch, but this might eat your children if you invoke it on a local branch that is not master) :-). But it looks like you just did something to your repo, as https://github.com/rsbivand/proj.4/blob/07a70d3f2397a627b4ae6855afb35cf6e47f9b49/test/unit/test_io.cpp#L5268-L5299 no longer resolves. Well, if you have the diff in some form, just paste it and I'll upstream it.

rsbivand commented 1 year ago

Yes, I deleted the fork (was proj.4 anyway, so pretty old). Re-establishing with a new fork, hopefully cleanly.

rsbivand commented 1 year ago

https://github.com/OSGeo/PROJ/pull/3494 created.

edzer commented 1 year ago

Fantastic. Shall we substitute this WKT definition for the current st_crs(NA) in sf?

rsbivand commented 1 year ago

PR created in can_transform branch.

edzer commented 1 year ago

So we now have a valid WKT crs that (currently) is not identical to NA crs:

> str(sf:::NA_crs_)
List of 2
 $ input: chr NA
 $ wkt  : chr NA
 - attr(*, "class")= chr "crs"
> sf:::is.na.crs
function (x) 
{
    identical(x, NA_crs_)
}
<bytecode: 0x560a00ad6218>
<environment: namespace:sf>

but that behaves identically: Cartesian, unknown units. This breaks measures (length, area) because not NA but also no units - so far CRS with a WKT were assumed to have units (wrongly substituting [m] when unkonwn). Shall we substitute this WKT definition for the current sf::NA_crs_ in sf? The advantage is that it is more descriptive: for an NA CRS one could still argue it is unknown whether it is Cartesian or geographic, but all code in sf (and earlier sp) goes the Cartesian path for NA crs.

rsbivand commented 1 year ago

I think that I'd tend to view geographical as implying that we know more than planar. I sense that raster went the other way, but I don't find any very strong argument for then saying that coordinates that would be valid within +/- 90 and +/- 180 or 0-360 should be geographical and others planar. I think planar is the position that assumes least.

edzer commented 1 year ago

This came up here: https://github.com/r-spatial/stars/issues/622#issuecomment-1485084554 when I tried to assign the "Undefined Cartesian SRS with unknown unit" to the GeoTIFF and warp it using the GPKG with the same SRS:

In CPL_gdalwarp(source, destination, options, oo, doo, config_options,  :
  GDAL Error 6: Cannot find coordinate operations from 
`ENGCRS["Undefined Cartesian SRS with unknown unit",EDATUM[""],CS[Cartesian,2],AXIS["(E)",east,ORDER[1],LENGTHUNIT["metre",1,ID["EPSG",9001]]],AXIS["(N)",north,ORDER[2],LENGTHUNIT["metre",1,ID["EPSG",9001]]]]' to 
`ENGCRS["Undefined Cartesian SRS with unknown unit",EDATUM["Unknown engineering datum"],CS[Cartesian,2],AXIS["x",unspecified,ORDER[1],LENGTHUNIT["unknown",0]],AXIS["y",unspecified,ORDER[2],LENGTHUNIT["unknown",0]]]'

which can be reproduced command line with

$ gdalwarp noise.tif test.tiff -cutline buf.gpkg 
Processing noise.tif [1/1] : 0ERROR 6: Cannot find coordinate operations from 
`ENGCRS["Undefined Cartesian SRS with unknown unit",EDATUM[""],CS[Cartesian,2],AXIS["(E)",east,ORDER[1],LENGTHUNIT["metre",1,ID["EPSG",9001]]],AXIS["(N)",north,ORDER[2],LENGTHUNIT["metre",1,ID["EPSG",9001]]]]' to 
`ENGCRS["Undefined Cartesian SRS with unknown unit",EDATUM["Unknown engineering datum"],CS[Cartesian,2],AXIS["x",unspecified,ORDER[1],LENGTHUNIT["unknown",0]],AXIS["y",unspecified,ORDER[2],LENGTHUNIT["unknown",0]]]'

files.zip

Reprex in R:

library("stars")
n = 2000^2
v = rnorm(n)
bbox = st_bbox(c(xmin = 0, xmax = 1, ymin = 0, ymax = 1))
r = st_as_stars(bbox, nx = 2000, ny = 2000, values = v)
pt = st_sfc(st_point(c(0.5, 0.5)))
buff = st_buffer(pt, dist = 0.4)
tmp1 = tempfile(fileext = ".tif")
tmp2 = tempfile(fileext = ".gpkg")
write_sf(buff, tmp2)
st_crs(r) = st_crs(st_read(tmp2))
write_stars(r, tmp1)
gdal_utils(util = "warp", source = tmp1, destination = "test.tiff", options = c("-cutline", tmp2))
rouault commented 1 year ago

which can be reproduced command line with

should be fixed per https://github.com/OSGeo/PROJ/pull/3685