Closed jacpete closed 3 years ago
Essentially what I am seeing here is that getEsriFeaturesByIds() gets called when pulling the json data from the feature/map server. It is currently hardcoded for the EPSG:4326 (line 65 above). However the crs
argument in `esri2sf() doesn't get used until the last call in esri2sf().
https://github.com/yonghah/esri2sf/blob/cbcaf3bd2042905d3b582a927ba79dad7dcbe318/R/esri2sf.R#L34-L65
This is an issue because we are forcing the data to come in as EPSG:4326 but then we allow the data to be defined as any CRS given to the crs
argument in esri2sfGeom().
https://github.com/yonghah/esri2sf/blob/cbcaf3bd2042905d3b582a927ba79dad7dcbe318/R/zzz.R#L119-L134
I can see this leading to some unexpected errors when trying to define a CRS. I believe sf::st_sf() only directly sets a CRS for a layer and does not attempt a transformation.
I also see another (more opinionated) issue with using EPSG:4326 as the default because it means the user of the package needs to be aware that they may be introducing datum transformation shift error into their dataset if the Feature/Map Service has the data stored in a different datum than WGS1984. While this error is often relatively low (the example I often encounter is WGS1984 to NAD1983 with an error of about 2.5 feet) it is unacceptable in some cases. I see two paths to fix this.
crs
argument be NULL
and making a few changes to the esri2sfGeom() and getEsriFeaturesByIds() functions. The downside to this is it may break users current code if they expect the data to come back in EPSG:4326. Once the data is in R they can do the transformation with correct datum shift if needed or they could still specify crs = 4326
in the function explicitly.I am going to work on coding up a pull request for option 2 for a functional example and we can discuss whether or not it should be accepted and/or what the default should be. I guess we could always make the default be crs = 4326
but still add in the crs = NULL
functionality and documentation depending on your preferences.
So I've been working on a pull request for this issue and I am hitting a small snag. For the outSR
parameter in getEsriFeaturesByIds()
ESRI's query format (https://developers.arcgis.com/rest/services-reference/enterprise/query-feature-service-layer-.htm) requires the input to be either a WKTid or spatial reference JSON object that can be formatted to contain an ESRI wkt. In my testing and research, the ESRI wkt is different than the normal OGR WKT generally given by R, but GDAL does have the power to convert to it. What I am looking for now is a way to interface with the GDAL library linked in R through the {sf} package. On linux this is pretty easy because GDAL is compiled and installed separately from the sf install, but on Windows (and I think OSX but I don't run this OS to test) the GDAL library is compiled and installed with the {sf} binaries and it makes it difficult to interface with the GDAL library directly.
Without a way to link to GDAL, we will be limited in our inputs for the crs
argument. We will have to restrict it to only using WKTids These can be either EPSG or ESRI authority WKTids though as the ESRI query format seems to be able to handle either without direct specification. However, sf::st_sf()
does require direct specification so I added a helper (hidden) function getWKTidAuthority
to do a direct query to the proj.db SQLite database in the PROJ library linked to by the {sf} package. This function adds imported dependencies for the {DBI} and {RSQLite} packages which were imported or suggested by {sf} already.
https://github.com/jacpete/esri2sf/blob/62c6e8a3368d1fe9bfe24e400110d5e8606e3fe8/R/zzz.R#L149-L172
To extend the function to be able to accept any crs format you can generally use in R (e.g., proj4string, WKT, numeric EPSG) there will need to be an interface directly to the GDAL gdalsrsinfo
command line tool (I think it may be easiest to create and issue in {sf} to extend sf::gdal_utils()
for this one; https://gdal.org/programs/gdalsrsinfo.html) or improve the {Rccp} interface to exportToWkt
C++ API (https://gdal.org/api/ogrspatialref.html#_CPPv4NK19OGRSpatialReference11exportToWktEPPcPPCKc) with an option to specify the FORMAT parameter and create an exported R function to control it. While I think the implementation of the first option may allow slightly easier access to the functionality, I think the addition of the second may already be being discussed by package authors (see comment on line 115 of code from {sf} below.
https://github.com/r-spatial/sf/blob/405ca1ca4ebdbd5672948aea88251e5d12004740/src/gdal.cpp#L115-L128
Thank you for the detailed comments and the PR. I hope the sf package will address this issue as well.
As of now, the outSR parameter in the
query
list creation is hard coded to be WGS1984 EPSG:4326. I am guessing this is legacy before the crs argument was added esri2sf. I believe this needs updated so that the output CRS can be specified correctly.https://github.com/yonghah/esri2sf/blob/cbcaf3bd2042905d3b582a927ba79dad7dcbe318/R/zzz.R#L62-L73
I am going to fork the repository and do some testing and may upload a pull request.