r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.33k stars 293 forks source link

Upgrading legacy R code from proj4 to proj6 #1551

Closed EuanGMason closed 3 years ago

EuanGMason commented 3 years ago

I really appreciate all the expertise and the time that has gone into improving CRSs with new versions of rgdal and sp.

It would be very helpful if sample R code could be developed showing how to use proj 6. There are millions of GIS layers out there with embedded proj4string CRSs, and we now find that all our legacy code doesn't run with the latest rgdal and sp libraries. Some examples showing how to deal with this would be a great help. For instance:

a) I recently developed object-oriented R code to represent models of forest growth based on tree physiology. As part of the system I need to allow users to enter positions of measured forest plots in NZTM (2193) which is a much used New Zealand GPS standard, and then to run the code I need WGS84 latlong (4326) to get solar radiation and relative positions of meteorological estimates. On my system, with R 3.6 and older sp and rgdal libraries, I simply use spTransform and it works perfectly, albeit with a few warnings about impending doom for proj4strings. I sent a workspace to some clients for comment and they installed R 4.0 and the latest rgdal & sp libraries. They encountered fatal errors when establishing links to rasters with proj4strings and when they hit an object with an embedded spTransform command. Old versions of rgdal will not install in R 4.0. I'd rather update the code to comply with the latest R versions and libraries but on-line resources are unclear.

b) I often make raster layers for clients with landscape level estimates of site productivity based on tree physiology, soils info and meterological estimates. I use rasterFromXYZ, which requires a CRS in proj4string format, but latest rgdal and sp libraries don't allow access to GPS info using rasters with proj4string CRSs. How can I do this with a proj 6 CRS?

Thanks, Euan Mason

edzer commented 3 years ago

I think we pointed this out here. Could you please provide a simple reprex that causes problems for you?

EuanGMason commented 3 years ago

Here's a simple example:

pnt=data.frame(x=EastingI,y=NorthingI) pnt=SpatialPoints(pnt,CRS(SRS_string = st_crs(2193)$wkt)) #NZTM pnt=spTransform(pnt,CRS(SRS_string = st_crs(4326)$wkt)) #WGS84

This works fine with R 3.6, rgdal 1.5.16 and sp 1.4.2, but not with R 4.0, rgdal 1.5.18 and sp 1.4.4.

Another example that works with old R & libraries but not with upgrades (validPoints has about 10 million records):

SIValid<-data.table(x=validPoints$x,y=validPoints$y,SIPred=validPoints$SIPredA) setkey(SIValid,"x","y") SIRaster<-rasterFromXYZ(SIValid,crs=CRS("+proj=tmerc +lat_0=0 +lon_0=173 +k=0.9996 +x_0=1600000 +y_0=10000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")) writeRaster(SIRaster,filename="SIRasterV2L.tif",format="GTiff",overwrite=T) rm(SIValid) SIRaster=raster("SIRasterV2L.tif")

Thanks, Euan

rsbivand commented 3 years ago

This recent tweet https://twitter.com/RogerBivand/status/1334919051932479488, and very many postings on R-sig-geo over the last several years are sufficient warning that those offering services need always to track releases preemptively (@edzer is quite right, users are supposed to read blogs). You have not provided a reproducible example, not the output of sessionInfo(). I'll follow up shortly using Windows CRAN binaries, but all of this is down to your not tracking changes and ending up with unhappy customers.

rsbivand commented 3 years ago

On Windows, R 4.0.3, current CRAN binaries of raster, ""sp, rgdal and sf**, there are no errors for one point (taken arbitrarily as (500000, 500000).

My suspicion is that the stored WKT may be a BOUNDCRS, which would cause trouble for transformation. I can get a BOUNDCRS from:

rgdal::set_prefer_proj(FALSE)
aa <- CRS("+proj=tmerc +lat_0=0 +lon_0=173 +k=0.9996 +x_0=1600000 +y_0=10000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs", get_source_if_boundcrs=FALSE)

Without +towgs84=0,0,0,0,0,0,0, even using GDAL rather than PROJ to instantiate CRS (see https://twitter.com/geospacedman/status/1321206393152692229), I get a straight PROJCRS. I suspect that the difference in user experience is in 1) different raster versions (most likely), 2) different approach to admitting BOUNDCRS in rgdal carefully noted in https://cran.r-project.org/web/packages/rgdal/news/news.html, 3) customers using wierd PROJ/GDAL version combinations and installing from source R packages from source (cloud nodes, defective container).

We need to see the script source as well as the description of the compute environment. An affiliation would also be helpful - joining github just to post a question suggests that this never was an sf issue, and should be diverted to R-sig-geo for public discussion.

rsbivand commented 3 years ago

Further, for current raster, the claim that "rasterFromXYZ ... requires a CRS in proj4string format" is incorrect:

crs: CRS object or a character string describing a projection and datum in PROJ.4 format

rsbivand commented 3 years ago

A recent lecture will be online soon: https://youtu.be/2H1Tn4oN32M with presentation and link to scripts on https://rsbivand.github.io/ECS530_h20/ECS530_III.html

edzer commented 3 years ago

I'd be happy to reopen this once a reprex is provided.

DOSull commented 3 years ago

I wonder if this is a problem in some way specific to EPSG:2193. I've been working up teaching material for interpolation, using sf, sp, raster and gstat, and keep running into problems that appear to be related to discarded datums.

Without getting into the details of my interpolation code (where I eventually have to do a as(..., "Spatial") conversion and problems arise, I've been trying to figure things out at a more basic level. If I examine EPSG:2193, per the code on this post: https://cran.r-project.org/web/packages/rgdal/vignettes/CRS_projections_transformations.html then here's what I see

> cat(sp::wkt(sp::CRS(SRS_string = "EPSG:2193")))
PROJCRS["NZGD2000 / New Zealand Transverse Mercator 2000 (with axis order normalized for visualization)",
    BASEGEOGCRS["NZGD2000",
        DATUM["New Zealand Geodetic Datum 2000",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4167]],
    CONVERSION["New Zealand Transverse Mercator 2000",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",173,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",1600000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]],
        ID["EPSG",19971]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]Warning message:
In showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj = prefer_proj) :
  Discarded datum New Zealand Geodetic Datum 2000 in Proj4 definition

That discarded datum warning suggests that even a properly authorized instance of EPSG:2193 may lead to problems when moving data between formats. Sure enough a file whose CRS is recognised by QGIS 3.18 as EPSG:2193 is read by sf without problems:

nz_sf <- sf::st_read("nz-2193.gpkg")

but

nz_sp <- rgdal::readOGR("nz-2193.gpkg")

reports

Warning message:
In OGRSpatialRef(dsn, layer, morphFromESRI = morphFromESRI, dumpSRS = dumpSRS,  :
  Discarded datum New_Zealand_Geodetic_Datum_2000 in Proj4 definition: +proj=tmerc +lat_0=0 +lon_0=173 +k=0.9996 +x_0=1600000 +y_0=10000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs

Having said that st_crs(nz_sf)$wkt == wkt(nz_sp) reassuringly returns TRUE.

However, If subsequently I need to do as(nz_sf, "Spatial") I get the discarded datum warning and subsequently have problems with mismatched. missing or unknown CRS.

I guess I am just confused what it is I can do as a user to resolve these issues. Converting from sf to sp is something I need to do relatively often and if I am working in EPSG:2193 it leads to problems. Even something like

crs(nz_sp) <- CRS(SRS_string = st_crs(nz_sf)$wkt)

produces the dread dropped datum warning.

Should we (in New Zealand) be lobbying a relevant authority (Land Information New Zealand) to update the CRS metadata? The problem seems to be the unwillingness of gdal and various other parts of the stack to deal with GRS80.

I should note that if I am working exclusively in sf (which I generally try to do) then none of these problems arise, and as advertised my recent move to gdal 3 and proj>6 has been invisible. I'm using GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0 and here's the output from sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C               LC_TIME=en_NZ.UTF-8        LC_COLLATE=en_NZ.UTF-8    
 [5] LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8    LC_PAPER=en_NZ.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sf_0.9-8     raster_3.4-5 sp_1.4-5    

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6         magrittr_2.0.1     units_0.7-1        tidyselect_1.1.0   lattice_0.20-41   
 [6] R6_2.5.0           rlang_0.4.10       fansi_0.4.2        dplyr_1.0.5        tools_4.0.3       
[11] grid_4.0.3         utf8_1.2.1         KernSmooth_2.23-18 e1071_1.7-6        DBI_1.1.1         
[16] ellipsis_0.3.1     class_7.3-18       assertthat_0.2.1   tibble_3.1.0       lifecycle_1.0.0   
[21] crayon_1.4.1       purrr_0.3.4        vctrs_0.3.7        codetools_0.2-18   glue_1.4.2        
[26] proxy_0.4-25       compiler_4.0.3     pillar_1.5.1       generics_0.1.0     classInt_0.4-3    
[31] pkgconfig_2.0.3 

I've also attached that file, should you want to poke it. nz-2193-gpkg.zip

DOSull commented 3 years ago

As an addendum to the above. I spent some time trying to make a reprex. With extremely careful attention to the various blog posts and documentation it is possible to move data around among sf, sp, gstat, raster and so on, while ignoring numerous warnings about dropped datums, and still get things to work.

It's not a particularly frictionless experience however, which is a bit sad, having become used to how smooth the R-spatial ecosystem had become recently. It's far from an ideal situation for persuading students that this really is a better platform to be working in than desktop GIS!

Here's hoping that it won't be too long before some of the wrinkles get ironed out, and I don't have to be doing things like

crs(some_raster) <- st_crs(some_sf)$wkt

any more, and can stop telling my students to relax and not worry about all those damned dropped datum messages... :-)

rsbivand commented 3 years ago

@DOSull: well, the warnings got your attention, which was their intention. sp and raster use a "CRS" object, which remains the old object (with a "projargs" slot) but with a WKT2 object in a comment if read from file through an up-to-date set of packages. Remaining problems can come when stored objects are bundled with packages or tutorials without WKT2 comments. They, and labs that do not update packages regularly, are why the warnings are still there. You will have noticed that rgdal does say clearly when it loads:

To mute warnings of possible GDAL/OSR exportToProj4() degradation, use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.

Maybe try it?

DOSull commented 3 years ago

You're right it got my attention! It's just a bit fugly at the moment, which is frustrating.

Concerning my question about EPSG:2913 (the national standard map projection around these parts) and the fact it throws warnings due to GRS80, what's the workaround there? Should LINZ be somehow updating the metadata for that projection? Is it doomed to throw these warnings for ever more?

edzer commented 3 years ago

having become used to how smooth the R-spatial ecosystem had become recently.

Much of what you experience is our adaptation to OSGEO, the development of PROJ 4 - 5 - 6 - 7- 8 over a few years time and our attempts of understanding all that and reacting to it.

rsbivand commented 3 years ago

If you look at:

$ projinfo EPSG:2193
PROJ.4 string:
+proj=tmerc +lat_0=0 +lon_0=173 +k=0.9996 +x_0=1600000 +y_0=10000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs +type=crs

WKT2:2019 string:
PROJCRS["NZGD2000 / New Zealand Transverse Mercator 2000",
    BASEGEOGCRS["NZGD2000",
        DATUM["New Zealand Geodetic Datum 2000",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4167]],
    CONVERSION["New Zealand Transverse Mercator 2000",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",173,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",1600000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Engineering survey, topographic mapping."],
        AREA["New Zealand - North Island, South Island, Stewart Island - onshore."],
        BBOX[-47.33,166.37,-34.1,178.63]],
    ID["EPSG",2193]]

the WKT2 2019 is fully specified as submitted to EPSG, I think. I would have been happier if PROJ desisted from emitting +towgs84=0,0,0,0,0,0,0 which is deprecated, and here a no-op anyway. For sf:

> st_crs("EPSG:2193")
Coordinate Reference System:
  User input: EPSG:2193 
  wkt:
PROJCRS["NZGD2000 / New Zealand Transverse Mercator 2000",
    BASEGEOGCRS["NZGD2000",
        DATUM["New Zealand Geodetic Datum 2000",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4167]],
    CONVERSION["New Zealand Transverse Mercator 2000",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",173,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",1600000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Engineering survey, topographic mapping."],
        AREA["New Zealand - North Island, South Island, Stewart Island - onshore."],
        BBOX[-47.33,166.37,-34.1,178.63]],
    ID["EPSG",2193]]

and rgdal:

> cat(wkt(CRS("EPSG:2193")), "\n")
PROJCRS["NZGD2000 / New Zealand Transverse Mercator 2000 (with axis order normalized for visualization)",
    BASEGEOGCRS["NZGD2000",
        DATUM["New Zealand Geodetic Datum 2000",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4167]],
    CONVERSION["New Zealand Transverse Mercator 2000",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",173,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",1600000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",10000000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]],
        ID["EPSG",19971]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]] 

with axis order concerns. My suspicion would be that there are more accurate epoch-based datums, but it'll come down to the accuracy you require.

DOSull commented 3 years ago

having become used to how smooth the R-spatial ecosystem had become recently.

Much of what you experience is our adaptation to OSGEO, the development of PROJ 4 - 5 - 6 - 7- 8 over a few years time and our attempts of understanding all that and reacting to it.

I appreciate the challenge this must be - it's pretty strange for a core package to sit essentially unchanged for many years and then suddenly to wake up and for everything to be in rapid flux over such a short period!