r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.31k stars 291 forks source link

Potential issue with st_transform giving wrong output and migration to proj6+ #1634

Closed FrogBoss74 closed 3 years ago

FrogBoss74 commented 3 years ago

Dear Team,

I have been following the development of r-spatial with interest and first of all thank you for your tremendous contribution. The problem I have is using st_transform with EPSG:24047. In this example, I have a point on local EPSG:24047 map being on (661715.11,1520500.03), corresponding to EPSG:4326 being (100.4926613,13.7520388).

EPSG.io is also putting the point at that location. EPSG.io (may need to input (661715.11,1520500.03) manually in the GUI)

st_transform put the point at EPSG:4326 (100.491 13.7525) instead several dozen of meters to the left.

I would appreciate some insights about resolving this. I have millions of transformations to perform and I want to stick to sf package to avoid major code rewrite. Is this a bug or something i am doing wrong? This code worked perfectly before the migration from proj4 ie before I updated sf to latest CRAN version. I also presume EPSG:24047 is not the only crs having a problem.

Herve

library(sf)

> # Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
> 
> # Assuming the following point c(661715.11,1520500.03) is EPSG:24047
> # https://epsg.io/map#srs=24047&x=661715.11&y=1520500.03&z=19&layer=streets
> #
> # ESPG.io puts it correctly (check link above)
> # however, st_transform puts the point to the wrong location

loc = st_sfc(st_point(c(661715.11,1520500.03)), crs=24047)
wrong_loc = st_transform(loc,4326)

> # Geometry set for 1 feature 
> # geometry type:  POINT
> # dimension:      XY
> # bbox:           xmin: 100.491 ymin: 13.7525 xmax: 100.491 ymax: 13.7525
> # geographic CRS: WGS 84
> # POINT (100.491 13.7525)
> # st_transform gives wrong location

library(mapview)
mapview(wrong_loc)

> # The correct location should be
> # https://epsg.io/transform#s_srs=24047&t_srs=4326&x=661715.1100000&y=1520500.0300000
> # 100.4926613
> # 13.7520388
> 

correct_loc = st_sfc(st_point(c(100.4926613,13.7520388)), crs=4326)
mapview(correct_loc)

sessionInfo()

> # R version 4.0.4 (2021-02-15)
> # Platform: x86_64-w64-mingw32/x64 (64-bit)
> # Running under: Windows 10 x64 (build 19042)
> # 
> # Matrix products: default
> # 
> # locale:
> # [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
> # [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
> # 
> # attached base packages:
> # [1] stats     graphics  grDevices utils     datasets  methods   base     
> # 
> # other attached packages:
> # [1] mapview_2.9.8 sf_0.9-7     
> # 
> # loaded via a namespace (and not attached):
> # [1] fs_1.5.0                satellite_1.0.2         lubridate_1.7.10        webshot_0.5.2           RColorBrewer_1.1-2     
> # [6] httr_1.4.2              tools_4.0.4             backports_1.2.1         utf8_1.1.4              rgdal_1.5-23           
> # [11] R6_2.5.0                KernSmooth_2.23-18      DBI_1.1.1               lazyeval_0.2.2          colorspace_2.0-0       
> # [16] raster_3.4-5            withr_2.4.1             sp_1.4-5                tidyselect_1.1.0        leaflet_2.0.4.1        
> # [21] compiler_4.0.4          leafem_0.1.3            cli_2.3.1               rvest_1.0.0             xml2_1.3.2             
> # [26] labeling_0.4.2          scales_1.1.1            classInt_0.4-3          readr_1.4.0             proxy_0.4-25           
> # [31] systemfonts_1.0.1       digest_0.6.27           foreign_0.8-81          svglite_2.0.0           base64enc_0.1-3        
> # [36] dichromat_2.0-0         pkgconfig_2.0.3         htmltools_0.5.1.1       dbplyr_2.1.0            htmlwidgets_1.5.3      
> # [41] rlang_0.4.10            readxl_1.3.1            rstudioapi_0.13         farver_2.1.0            generics_0.1.0         
> # [46] jsonlite_1.7.2          crosstalk_1.1.1         dplyr_1.0.5             magrittr_2.0.1          Rcpp_1.0.6             
> # [51] munsell_0.5.0           fansi_0.4.2             abind_1.4-5             lifecycle_1.0.0         stringi_1.5.3          
> # [56] yaml_2.2.1              grid_4.0.4              parallel_4.0.4          forcats_0.5.1           crayon_1.4.1           
> # [61] lattice_0.20-41         haven_2.3.1             stars_0.5-2             hms_1.0.0               leafpop_0.0.6          
> # [66] pillar_1.5.1            uuid_0.1-4              stats4_4.0.4            codetools_0.2-18        reprex_1.0.0           
> # [71] XML_3.99-0.6            glue_1.4.2              leaflet.providers_1.9.0 data.table_1.14.0       modelr_0.1.8           
> # [76] vctrs_0.3.6             png_0.1-7               cellranger_1.1.0        gtable_0.3.0            purrr_0.3.4            
> # [81] tidyr_1.1.3             assertthat_0.2.1        lwgeom_0.2-5            broom_0.7.5             e1071_1.7-6            
> # [86] class_7.3-18            viridisLite_0.3.0       tibble_3.1.0            units_0.7-1             brew_1.0-6             
> # [91] ellipsis_0.3.1  
rsbivand commented 3 years ago

Using a newer PROJ (thanks for reporting the versions of external software!) and current sf 0.9-8, and its new sf_proj_pipelines() function, I get the same difference that you see, but see that:

library(sf)
# Linking to GEOS 3.9.1, GDAL 3.2.2, PROJ 8.0.0
o <- sf_proj_pipelines("EPSG:24047", "OGC:CRS84") # refered to EPSG:4326 because it is easting:northing
o$definition; new in sf 0.9-8
# [1] "+proj=pipeline +step +inv +proj=utm +zone=47 +ellps=evrst30 +step +proj=push +v_3 +step +proj=cart +ellps=evrst30 +step +proj=helmert +x=293 +y=836 +z=318 +rx=0.5 +ry=1.6 +rz=-2.8 +s=2.1 +convention=position_vector +step +inv +proj=cart +ellps=WGS84 +step +proj=pop +v_3 +step +proj=unitconvert +xy_in=rad +xy_out=deg"
# [2] "+proj=pipeline +step +inv +proj=utm +zone=47 +ellps=evrst30 +step +proj=push +v_3 +step +proj=cart +ellps=evrst30 +step +proj=helmert +x=210 +y=814 +z=289 +step +inv +proj=cart +ellps=WGS84 +step +proj=pop +v_3 +step +proj=unitconvert +xy_in=rad +xy_out=deg"                                                            
# [3] "+proj=pipeline +step +inv +proj=utm +zone=47 +ellps=evrst30 +step +proj=unitconvert +xy_in=rad +xy_out=deg"
o$accuracy
# [1]  3  5 NA
(wrong_loc <- st_transform(loc, "OGC:CRS84"))
# ...
POINT (100.491 13.7525)
(wrong_loc1 <- st_transform(loc, "OGC:CRS84", pipeline=o$definition[1]))
# ...
# POINT (100.491 13.7525)
(wrong_loc2 <- st_transform(loc, "OGC:CRS84", pipeline=o$definition[2]))
# ...
# POINT (100.4927 13.75204)

sf queries the PROJ database for instantiable coordinate operations, with here three returned (with or without the PROJ on-demand grid download enabled). They are ordered by reported accuracy as registered in the database. The first, 3m accuracy (7 parameter Helmert), gives your unexpected result, but the 5m accuracy (3 parameter Helmert), is what you expect. epsg.io appears to be running off of the EPSG version 9.8 database and is not a website of EPSG, PROJ 8 has EPSG database version 10.015.

My guess is that, in your workflow, the input point(s) were converted from geographical coordinates to projected using the 3-parameter Helmert (so I think that the projected coordinates are most likely imprecise). Can you independently confirm the GPS/GNSS position of loc (are you in Bangkok) on the ground?

FrogBoss74 commented 3 years ago

Thanks a lot for your answer.

I am using a published georeferenced map and found espg.io points to exactly those reference points. Extract of the map is attached. Actual calibration point I need to use for my project is loc = st_point(c(730500,1403400))%>%st_sfc(crs=24047))%>%st_sf(ref="Geo Reference") The 3rd definition in o$definition gives a better fit it seems, not sure I can explain why. The published map is relatively old. It may well be how the map was defined though such a difference is not expected.

I could not trace the exact configuration of sf I was using in Apr 2020, but that configuration with Proj4 gave a perfect match.

Extract of the georeferenced map is attached below. The reference point (730500,1403400) is spot on the road image

FrogBoss74 commented 3 years ago

correction the 2nd definition gives a better fit

rsbivand commented 3 years ago

epsg.io 24047 gives TOWGS84[210,814,289,0,0,0,0], as part of the Indian 1975 datum definition. The map you refer to was probably based on that, although probably the actual path from geographical coordinates to planar coordinates is not recorded on the map sheet. Some details are given here: https://www.asprs.org/wp-content/uploads/2012/05/02-2011-thailand.pdf. The EPSG transformation codes listed on the epsg.io page (1154, 1304 (default), 1537, and 1812) are those found by reading from proj.db directly:

library(sf)
shpr <- sf_proj_search_paths()
library(RSQLite)
db <- dbConnect(SQLite(), dbname=file.path(shpr[length(shpr)], "proj.db"))
cov <- dbReadTable(db, "coordinate_operation_view")
(I75 <- cov[grep("Indian 1975", cov$name), c("table_name", "code", "name", "method_name", "accuracy")])
helm <- dbReadTable(db, "helmert_transformation_table")
helm[helm$code %in% I75$code, c("auth_name", "code", "name", "tx", "ty", "tz", "rx", "ry", "rz", "scale_difference")]
dbDisconnect(db)

where [1] is 1812, and [2] is 1304 (default for epsg.io). Because mapview uses sf::st_transform() and the most accurate available pipeline found (so [1]), the only viable option here is to retrieve the geographical coordinates from the planar coordinates from the reference map most likely constructed using [2], and re-create them using [1]:

o <- sf_proj_pipelines("EPSG:24047", "OGC:CRS84")
(wrong_loc2 <- st_transform(loc, "OGC:CRS84", pipeline=o$definition[2]))
(right_loc <- st_transform(wrong_loc2, "EPSG:24047", pipeline=o$definition[1]))
library(mapview)
mapviewOptions(fgb = FALSE)
mapview(right_loc)

In this case, there are no grids on https://cdn.proj.org/, so the database lookup gives the best accuracies available. The interesting underlying problem here is that the planar map imposes a specific coordinate operation choice to transform from planar to geographical coordinates, one that may have been best practice when the map was made, but no longer is best practice.

FrogBoss74 commented 3 years ago

It is quite interesting. Surprised to such a difference. The sf_proj_pipelines seems to be a good workaround. Upgraded to sf 0.9-8 and upgraded proj.

This works now with an extra st_transform before each plot with tmap or mapview. The issue is that i need to manually inverse transform the x and y axis labels on map. Maybe such packages should offer an option to directly select the preferred definition as opposed to default. It took me a while to figure out mapview needs fgb option off to display sf objects.

Thanks

rsbivand commented 3 years ago

https://github.com/r-spatial/mapview/issues/321 we still need a way to provide a MIME type for application/flatgeobuf to resolve this, I fear.

tim-salabim commented 3 years ago

It took me a while to figure out mapview needs fgb option off to display sf objects.

@FrogBoss74 you should only need to turn it off if you're using mapview inside of a rmarkdown/notebook document. If it didn't render in plain interactive mode, there's likely something wrong with the current fgb implementation. Please file a bug-report if the latter is the case...

FrogBoss74 commented 3 years ago

Yes, the mapview 2.9.8 issue looks systematic on my configuration. I will report as an issue. It plots sfc points but not sf data where geometry is polygon.