r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.35k stars 299 forks source link

Using a custom pipeline in st_transform() seems to not result in an updated CRS #2439

Closed florisvdh closed 2 months ago

florisvdh commented 2 months ago

Describe the bug

In https://www.paulamoraga.com/book-spatial/point-process-modeling.html#observed-solanum-plant-species-in-bolivia, a PROJ string is being used to update the units of EPSG:5356 from meters to kilometers. However this is problematic since most datums, including the applicable "Marco Geodesico Nacional de Bolivia", are not supported in proj-strings for CRSs in modern PROJ versions. (To be more specific, PROJ strings are not recommended to define a CRS anymore.)

I wanted to use the modern approach for coordinate conversion, i.e. using a PROJ string to define the operation (not the target CRS in itself). For this, I used st_transform(<sfc-object>, pipeline = <proj-string>).

While this appears to be successful in updating the coordinates, the CRS of the resulting sfc object is lost. Maybe this is because the target CRS is not defined by an authority, but I assume it should actually result in an updated WKT2:2019.

To Reproduce

library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE

point_5356 <- 
  st_point(c(445761.6, 8064389)) |> 
  st_sfc(crs = 5356)

st_coordinates(point_5356)
#>             X       Y
#> [1,] 445761.6 8064389

st_crs(point_5356)$Name
#> [1] "MARGEN / UTM zone 19S"

point_5356_km <- 
  point_5356 |> 
  st_transform(pipeline = "+proj=unitconvert +xy_out=km")

st_coordinates(point_5356_km)
#>             X        Y
#> [1,] 445.7616 8064.389

st_crs(point_5356_km)
#> Coordinate Reference System: NA

sf::sf_extSoftVersion()
#>           GEOS           GDAL         proj.4 GDAL_with_GEOS     USE_PROJ_H 
#>       "3.12.1"        "3.8.4"        "9.3.1"         "true"         "true" 
#>           PROJ 
#>        "9.3.1"

Created on 2024-09-14 with reprex v2.1.1

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.4.1 (2024-06-14) #> os Linux Mint 21.3 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2024-09-14 #> pandoc 3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> class 7.3-22 2023-05-03 [3] RSPM (R 4.2.0) #> classInt 0.4-10 2023-09-05 [3] RSPM (R 4.3.0) #> cli 3.6.3 2024-06-21 [3] RSPM (R 4.4.0) #> DBI 1.2.3 2024-06-02 [3] RSPM (R 4.4.0) #> digest 0.6.37 2024-08-19 [3] RSPM (R 4.4.0) #> e1071 1.7-14 2023-12-06 [3] RSPM (R 4.3.0) #> evaluate 0.24.0 2024-06-10 [3] RSPM (R 4.4.0) #> fastmap 1.2.0 2024-05-15 [3] RSPM (R 4.4.0) #> fs 1.6.4 2024-04-25 [3] RSPM (R 4.3.0) #> glue 1.7.0 2024-01-09 [3] RSPM (R 4.3.0) #> htmltools 0.5.8.1 2024-04-04 [3] RSPM (R 4.3.0) #> KernSmooth 2.23-24 2024-05-17 [3] RSPM (R 4.4.0) #> knitr 1.48 2024-07-07 [3] RSPM (R 4.4.0) #> lifecycle 1.0.4 2023-11-07 [3] RSPM (R 4.3.0) #> magrittr 2.0.3 2022-03-30 [3] RSPM (R 4.2.0) #> proxy 0.4-27 2022-06-09 [3] RSPM (R 4.2.0) #> Rcpp 1.0.13 2024-07-17 [3] RSPM (R 4.4.0) #> reprex 2.1.1 2024-07-06 [3] RSPM (R 4.4.0) #> rlang 1.1.4 2024-06-04 [3] RSPM (R 4.4.0) #> rmarkdown 2.28 2024-08-17 [3] RSPM (R 4.4.0) #> rstudioapi 0.16.0 2024-03-24 [3] RSPM (R 4.3.0) #> sessioninfo 1.2.2 2021-12-06 [3] RSPM (R 4.2.0) #> sf * 1.0-17 2024-09-06 [1] RSPM (R 4.4.1) #> units 0.8-5 2023-11-28 [3] RSPM (R 4.3.0) #> withr 3.0.1 2024-07-31 [3] RSPM (R 4.4.0) #> xfun 0.47 2024-08-17 [3] RSPM (R 4.4.0) #> yaml 2.3.10 2024-07-26 [3] RSPM (R 4.4.0) #> #> [1] /home/floris/lib/R/library #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```

Additional context

The CRS obtained in the book, using a PROJ string CRS definition without datum (see below), still returns a WKT but clearly lacks the datum:

> st_crs(map)
Coordinate Reference System:
  User input: +proj=utm +zone=19 +south +ellps=GRS80 
                +towgs84=0,0,0,0,0,0,0 +units=km +no_defs 
  wkt:
BOUNDCRS[
    SOURCECRS[
        PROJCRS["unknown",
            BASEGEOGCRS["unknown",
                DATUM["Unknown based on GRS 1980 ellipsoid using towgs84=0,0,0,0,0,0,0",
# and so on

Compare with the source CRS WKT:

> st_crs(5356)
Coordinate Reference System:
  User input: EPSG:5356 
  wkt:
PROJCRS["MARGEN / UTM zone 19S",
    BASEGEOGCRS["MARGEN",
        DATUM["Marco Geodesico Nacional de Bolivia",
edzer commented 2 months ago

I've wondered this too, and am unsure this is solvable at the level of sf - my impression is that PROJ (or GDAL) can create the (set of) pipeline(s) to go from CRS_source to CRS_target (or the OGRCoordinateTransformation object which does this in case of GDAL), but given CRS_source and a pipeline, it cannot give CRS_target (in terms of an EPSG ID, or even WKT). Is that right @rouault ? If I'm wrong, how can it give CRS_target?

rouault commented 2 months ago

given CRS_source and a pipeline, it cannot give CRS_target (in terms of an EPSG ID, or even WKT). Is that right ?

yes, this is not really a solvable issue. In some cases, the source CRS + pipeline could in theory result in something that can be expressed as a CRS WKT (and in the most favorable cases matching a EPSG id), but coding such an algorithm would be extremely hard, a never ending task. Could be done for simple cases like recognizing unit changes, but for more complex pipelines, you'd have to recognize that using grid XXX means that you are changing to datum YYY, and in some cases there would be ambiguities because the same grid can be used for example to transform from an old national geodetic datum to a "modern" national geodetic datum ... or WGS 84 as an approximation of it.

In the situation expressed here, a (advanced) user can create the target CRS WKT from the source CRS WKT by just changing the units.

PROJCRS["MARGEN / UTM zone 19S (in kilometre)",
    BASEGEOGCRS["MARGEN",
        DATUM["Marco Geodesico Nacional de Bolivia",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",5354]],
    CONVERSION["UTM zone 19S",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",-69,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["kilometre",1000]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["kilometre",1000]]]

But for that particular use case of changing the units, there is a much easier way, by using proj_crs_alter_cs_linear_unit to directly obtain the desired target CRS

florisvdh commented 2 months ago

In the situation expressed here, a (advanced) user can create the target CRS WKT from the source CRS WKT by just changing the units.

Great! Applied this in R going through PROJJSON:

library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(purrr)

point_5356_km <- 
  st_point(c(445761.6, 8064389)) |> 
  st_sfc(crs = 5356) |> 
  st_transform(pipeline = "+proj=unitconvert +xy_out=km")

st_crs(point_5356_km)
#> Coordinate Reference System: NA

wkt_5356_km <- 
  PROJ::proj_crs_text("EPSG:5356", 2) |> 
  jsonlite::fromJSON() |> 
  # update units:
  assign_in(
    list("coordinate_system", "axis", "unit"),
    data.frame(
      type = rep("LinearUnit", 2),
      name = rep("kilometre", 2),
      conversion_factor = rep(1000, 2)
    )
  ) |> 
  # drop authority & code:
  assign_in("id", zap()) |> 
  # update name:
  modify_in("name", ~ paste(., "(in kilometre)")) |> 
  jsonlite::toJSON(auto_unbox = TRUE) |> 
  PROJ::proj_crs_text()

st_crs(point_5356_km) <- wkt_5356_km

point_5356_km
#> Geometry set for 1 feature 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 445.7616 ymin: 8064.389 xmax: 445.7616 ymax: 8064.389
#> Projected CRS: MARGEN / UTM zone 19S (in kilometre)
#> POINT (445.7616 8064.389)

cat(st_crs(point_5356_km)$wkt)
#> PROJCRS["MARGEN / UTM zone 19S (in kilometre)",
#>     BASEGEOGCRS["MARGEN",
#>         DATUM["Marco Geodesico Nacional de Bolivia",
#>             ELLIPSOID["GRS 1980",6378137,298.2572,
#>                 LENGTHUNIT["metre",1]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["degree",0.0174532925199433]],
#>         ID["EPSG",5354]],
#>     CONVERSION["UTM zone 19S",
#>         METHOD["Transverse Mercator",
#>             ID["EPSG",9807]],
#>         PARAMETER["Latitude of natural origin",0,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8801]],
#>         PARAMETER["Longitude of natural origin",-69,
#>             ANGLEUNIT["degree",0.0174532925199433],
#>             ID["EPSG",8802]],
#>         PARAMETER["Scale factor at natural origin",0.9996,
#>             SCALEUNIT["unity",1],
#>             ID["EPSG",8805]],
#>         PARAMETER["False easting",500000,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8806]],
#>         PARAMETER["False northing",10000000,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8807]]],
#>     CS[Cartesian,2],
#>         AXIS["(E)",east,
#>             ORDER[1],
#>             LENGTHUNIT["kilometre",1000]],
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["kilometre",1000]],
#>     USAGE[
#>         SCOPE["Engineering survey, topographic mapping."],
#>         AREA["Bolivia - west of 66°W."],
#>         BBOX[-22.91,-69.66,-9.77,-66]]]

Created on 2024-09-14 with reprex v2.1.1

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.4.1 (2024-06-14) #> os Linux Mint 21.3 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2024-09-14 #> pandoc 3.1.11 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> class 7.3-22 2023-05-03 [3] RSPM (R 4.2.0) #> classInt 0.4-10 2023-09-05 [3] RSPM (R 4.3.0) #> cli 3.6.3 2024-06-21 [3] RSPM (R 4.4.0) #> DBI 1.2.3 2024-06-02 [3] RSPM (R 4.4.0) #> digest 0.6.37 2024-08-19 [3] RSPM (R 4.4.0) #> e1071 1.7-14 2023-12-06 [3] RSPM (R 4.3.0) #> evaluate 0.24.0 2024-06-10 [3] RSPM (R 4.4.0) #> fastmap 1.2.0 2024-05-15 [3] RSPM (R 4.4.0) #> fs 1.6.4 2024-04-25 [3] RSPM (R 4.3.0) #> glue 1.7.0 2024-01-09 [3] RSPM (R 4.3.0) #> htmltools 0.5.8.1 2024-04-04 [3] RSPM (R 4.3.0) #> jsonlite 1.8.8 2023-12-04 [3] RSPM (R 4.3.0) #> KernSmooth 2.23-24 2024-05-17 [3] RSPM (R 4.4.0) #> knitr 1.48 2024-07-07 [3] RSPM (R 4.4.0) #> lifecycle 1.0.4 2023-11-07 [3] RSPM (R 4.3.0) #> magrittr 2.0.3 2022-03-30 [3] RSPM (R 4.2.0) #> PROJ 0.5.0 2024-06-12 [1] RSPM (R 4.4.1) #> proxy 0.4-27 2022-06-09 [3] RSPM (R 4.2.0) #> purrr * 1.0.2 2023-08-10 [3] RSPM (R 4.2.0) #> Rcpp 1.0.13 2024-07-17 [3] RSPM (R 4.4.0) #> reprex 2.1.1 2024-07-06 [3] RSPM (R 4.4.0) #> rlang 1.1.4 2024-06-04 [3] RSPM (R 4.4.0) #> rmarkdown 2.28 2024-08-17 [3] RSPM (R 4.4.0) #> rstudioapi 0.16.0 2024-03-24 [3] RSPM (R 4.3.0) #> sessioninfo 1.2.2 2021-12-06 [3] RSPM (R 4.2.0) #> sf * 1.0-17 2024-09-06 [1] RSPM (R 4.4.1) #> units 0.8-5 2023-11-28 [3] RSPM (R 4.3.0) #> vctrs 0.6.5 2023-12-01 [3] RSPM (R 4.3.0) #> withr 3.0.1 2024-07-31 [3] RSPM (R 4.4.0) #> wk 0.9.3 2024-09-06 [3] RSPM (R 4.4.0) #> xfun 0.47 2024-08-17 [3] RSPM (R 4.4.0) #> yaml 2.3.10 2024-07-26 [3] RSPM (R 4.4.0) #> #> [1] /home/floris/lib/R/library #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```
florisvdh commented 2 months ago

Closing as the general case of updating a CRS after a custom pipeline transformation is not solvable.

Thank you for the help!