r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.35k stars 300 forks source link

EPSG code gives wrong parameters #630

Closed davidorme closed 6 years ago

davidorme commented 6 years ago

Hi,

I not even sure if this is an sf bug, since it leads back deep into src/ where I fear to tread but:

> st_crs(29873)
Coordinate Reference System:
  EPSG: 29873 
  proj4string: "+proj=omerc +lat_0=4 +lonc=115 +alpha=53.31582047222222 +k=0.99984 
                +x_0=590476.87 +y_0=442857.65 +gamma=53.31582047222222 +ellps=evrstSS 
                +towgs84=-679,669,-48,0,0,0,0 +units=m +no_defs"

The alpha (azimuth) value is repeated for gamma (rectified_grid_angle), and should be 53.13010236111111 ( https://epsg.io/29873-1228). The resulting error at my site is about 750 metres.

This is a really obscure error in a projection that I really wish I didn't have to use, but I'm reporting it just in case this is parameter processing issue rather than just a typo in the database.

Oddly, the shapefile format makes matters worse because that gamma parameter doesn't make it into the .prj file at all and the errors go up to about 1 kilometre.

Cheers, David

davidorme commented 6 years ago

No, there is something odd going on within sf. If I set the crs using the proj4 string, then the parameters get altered: alpha and gamma aren't the same in the input string but are in the resulting printed description.

> timbalai <- '+proj=omerc +lat_0=4 +lonc=115 +alpha=53.31582047222222 +k=0.99984 
  +x_0=590476.87 +y_0=442857.65 +gamma=53.13010236111111 +ellps=evrstSS 
  +towgs84=-679,669,-48,0,0,0,0 +units=m +no_defs '
> st_crs(timbalai)
Coordinate Reference System:
  No EPSG code
  proj4string: "+proj=omerc +lat_0=4 +lonc=115 +alpha=53.31582047222222 +k=0.99984 
  +x_0=590476.87 +y_0=442857.65 +gamma=53.31582047222222 +ellps=evrstSS 
  +towgs84=-679,669,-48,0,0,0,0 +units=m +no_defs"
> 
edzer commented 6 years ago

This is st_crs() which lets GDAL do modifications. You may want to try lwgeom::st_transform_proj which circumvents going through GDAL and takes raw proj4 strings as input. Let me know if it works.

davidorme commented 6 years ago

I think the problem with that is that I can't set the correct CRS on the object to be transformed.

> timbalai <- '+proj=omerc +lat_0=4 +lonc=115 +alpha=53.31582047222222 +k=0.99984 
  +x_0=590476.87 +y_0=442857.65 +gamma=53.13010236111111 +ellps=evrstSS 
  +towgs84=-679,669,-48,0,0,0,0 +units=m +no_defs '
> dat_sf <- st_as_sf(dat, coords=c('X.coordinate..m.', 'Y.coordinate..m.'), crs=timbalai)
> dat_sf
Simple feature collection with 2688 features and 1 field
geometry type:  POINT
dimension:      XY
bbox:           xmin: 805856.4 ymin: 515360.5 xmax: 877613.4 ymax: 527290.5
epsg (SRID):    NA
proj4string:    +proj=omerc +lat_0=4 +lonc=115 +alpha=53.31582047222222 +k=0.99984 
  +x_0=590476.87 +y_0=442857.65 +gamma=53.31582047222222 +ellps=evrstSS 
  +towgs84=-679,669,-48,0,0,0,0 +units=m +no_defs
First 10 features:
                               Unique.Trap.ID                  geometry
D Trap Covariates 2013.xlsx.1       D100-1-1A POINT (876922.7 523050.7)
D Trap Covariates 2013.xlsx.2       D100-1-1B POINT (876914.7 523055.8)
davidorme commented 6 years ago

I haven't tested it but that looks like it should provide a workaround, thanks.

Is there any way to predict whether GDAL is going to modify a particular projection? What is the reason for the modification - in this case, it seems like an outright bug.

edzer commented 6 years ago

It is easy to test whether it changed syntactically:

library(sf)
# Linking to GEOS 3.5.1, GDAL 2.2.1, proj.4 4.9.3
p4s = "+proj=longlat"
# the test:
test_crs = function(x) x == st_crs(x)$proj4string
# try it:
test_crs(p4s)
# [1] FALSE
test_crs("+proj=longlat +ellps=WGS84 +no_defs")
# [1] TRUE

but much harder to establish whether the CRS' semantics changed.

davidorme commented 6 years ago

Yes, it surely is. The option I could think of would be to compare the local proj4string against an external source (as in the function below) and, as you say, it is really hard to know if differences are semantically meaningful. Different values for the same parameter name in the proj4string must be bad, though?

Commit https://github.com/r-spatial/lwgeom/commit/2154e5980947a7c574568327bbbf9a6debd5aac2 gives a mechanism to work around the problem once you've spotted it but it is quite hard for a user to detect that it is needed. In my case, it was only because I was trying to drag some obscure data in that I knew needed to align with existing data in WGS84 and didn't.

Is this something that is likely to resolve in future versions of GDAL? Otherwise, does it become necessary to screen codes using something like the attached to provide a hitlist of codes that get semantic differences. I know that is a horrific can of worms but coordinate shifts are definitely a bad thing!

crs_compare_epsgio <- function(epsg){

    require(RCurl)
    local <- st_crs(epsg)
    remote <- getURL(sprintf('https://epsg.io/%s.proj4?download', epsg))

    # function to coerce the string into a list of parameter names and values
    listify_p4s <- function(p4s){
        # split the string into parameter name +- value 
        p4s_parts <- strsplit(p4s, '\\s+')[[1]]
        p4s_parts <- strsplit(p4s_parts, '=')
        # get parameter names
        p4s_names <- sapply(p4s_parts, '[', 1)
        # inserts NULL values for valueless parameters (e.g. +no_defs)
        p4s_lens <- lengths(p4s_parts)
        p4s_vals <- mapply(function(L, V) if(L == 1) NULL else V[2], p4s_lens, p4s_parts)
        names(p4s_vals) <- p4s_names
        return(p4s_vals)        
    }

    # extract the components of the projection and compare  
    local <- listify_p4s(local$proj4string)
    remote <- listify_p4s(remote)

    params <- union(names(local), names(remote))
    differences <- list()
    for(p in params){
        if(!p %in% names(local)){
            differences[[p]] <- 'Missing from local'
        } else if(!p %in% names(remote)) {
            differences[[p]] <- 'Missing from remote'
        } else if(is.null(local[[p]]) & ! is.null(remote[[p]])){
            differences[[p]] <- 'No local value'
        } else if(! is.null(local[[p]]) & is.null(remote[[p]])){
            differences[[p]] <- 'No remote value'
        } else if(! is.null(local[[p]]) && 
                  ! is.null(remote[[p]]) && 
                  local[[p]] != remote[[p]]){
            differences[[p]] <- 'Differing values'
        }
    }
    return(differences)
}
crs_compare_epsgio(29873)
# $`+gamma`
# [1] "Differing values"
crs_compare_epsgio(32650)
# list()
edzer commented 6 years ago

Interesting! However, you might be seeing differences of EPSG versions here: mapping agencies might over time improve the parameters of their national CRS, e.g. one I'm familiar with is

> crs_compare_epsgio(28992)
$`+towgs84`
[1] "Differing values"

which points (I hope) to improved matching of the national ellipsoid to WGS84. It is clearly wrong to assume EPSG code -> proj4string mappings do not change over time.

rsbivand commented 6 years ago

Where does GDAL keep its epsg file now? The 2.2.3 epsg.wkt just points to two other files. By the way, the PROJ.4 5.0.0 epsg file now has a single first line of metadata giving the EPSG version (and breaking rgdal::make_EPSG()).

davidorme commented 6 years ago

Of course. Or not even versions: differing localisations might give a mismatch. So for my case, it depends on whether you want 1.0m accuracy for Brunei (https://epsg.io/29873-5249) or 5.0m accuracy for Sabah and Sarawak (https://epsg.io/29873-1852).

lb-dormelap:Downloads dorme$ diff 29873-5249.prj 29873-1852.prj 
6c6
<             TOWGS84[-533.4,669.2,-52.5,0,0,4.28,9.4],
---
>             TOWGS84[-689.5937,623.84046,-65.93566,0.02331,-1.17094,0.80054,5.88536],
rsbivand commented 6 years ago

data/pcs.csv is where the GDAL EPSG listings live, linking to other lists in the same directory, including gcs.csv for datum transformations.

edzer commented 6 years ago

Can we close this one?

davidorme commented 6 years ago

Yup, although it bothers me if GDAL is making hard to detect changes in projection parameters. I can see there isn't any way to detect it within sf that won't have a huge false positive rate.

I'd like to raise it as a bug in GDAL but I'm not sure how to demonstrate it using their API - I've tried using the python bindings to import to a GDAL SpatialReference from that proj4 string and I don't see any mangling, so I'm not sure what is going on:

In [1]: from osgeo import ogr
   ...: from osgeo import osr
In [2]: source = osr.SpatialReference()
In [3]: source.ImportFromProj4('''+proj=omerc +lat_0=4 +lonc=115 +alpha=53.31582047222222 +k=0.99984 
   ...:   +x_0=590476.87 +y_0=442857.65 +gamma=53.13010236111111 +ellps=evrstSS 
   ...:   +towgs84=-679,669,-48,0,0,0,0 +units=m +no_defs ''')
Out[3]: 0
In [4]: print source.ExportToPrettyWkt()
PROJCS["unnamed",
    GEOGCS["Everest (Sabah & Sarawak)",
        DATUM["unknown",
            SPHEROID["evrstSS",6377298.556,300.8017],
            TOWGS84[-679,669,-48,0,0,0,0]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Hotine_Oblique_Mercator_Azimuth_Center"],
    PARAMETER["latitude_of_center",4],
    PARAMETER["longitude_of_center",115],
    PARAMETER["azimuth",53.31582047222222],
    PARAMETER["rectified_grid_angle",53.13010236111111],
    PARAMETER["scale_factor",0.99984],
    PARAMETER["false_easting",590476.87],
    PARAMETER["false_northing",442857.65],
    UNIT["Meter",1]]