r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.35k stars 299 forks source link

swapped coordinates resulted from `st_transform` #2263

Closed elgabbas closed 12 months ago

elgabbas commented 1 year ago

Hello,

I found a weird behaviour while using st_transform function on an HPC as compared to windows.

Pts <- tibble::tibble(
    Longitude = c(8.641549, 8.641549, 8.641549, 8.641549, 6.970313, 7.451851), 
    Latitude = c(49.865556, 49.865556, 49.865556, 49.865556, 49.824565, 51.757528)) %>% 
    sf::st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326, remove = FALSE) %>% 
    sf::st_transform(crs = 3035)

This is the results on a Windows PC

Simple feature collection with 6 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4103036 ymin: 2972497 xmax: 4223327 ymax: 3186094
Projected CRS: ETRS89-extended / LAEA Europe
# A tibble: 6 × 3
  Longitude Latitude          geometry
*     <dbl>    <dbl>       <POINT [m]>
1      8.64     49.9 (4223327 2973462)
2      8.64     49.9 (4223327 2973462)
3      8.64     49.9 (4223327 2973462)
4      8.64     49.9 (4223327 2973462)
5      6.97     49.8 (4103036 2972497)
6      7.45     51.8 (4145096 3186094)

Whereas, this is the result on the Linux HPC [please, note the difference in the bounding box information and new coordinates]

Simple feature collection with 6 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 2972497 ymin: 4103036 xmax: 3186094 ymax: 4223327
Projected CRS: ETRS89-extended / LAEA Europe
# A tibble: 6 × 3
  Longitude Latitude          geometry
*     <dbl>    <dbl>       <POINT [m]>
1      8.64     49.9 (2973462 4223327)
2      8.64     49.9 (2973462 4223327)
3      8.64     49.9 (2973462 4223327)
4      8.64     49.9 (2973462 4223327)
5      6.97     49.8 (2972497 4103036)
6      7.45     51.8 (3186094 4145096)

Here is the result of sf::sf_extSoftVersion()

# windows (R: 4.3.2; sf: 1.0.14)
          GEOS           GDAL         proj.4   GDAL_with_GEOS     USE_PROJ_H           PROJ 
      "3.11.2"        "3.6.2"        "9.2.0"                 "true"         "true"        "9.2.0" 
# linux (R: 3.2.2; sf: 1.0.14)
       GEOS           GDAL         proj.4 GDAL_with_GEOS     USE_PROJ_H      PROJ
      "3.11.1"        "3.6.2"        "9.1.1"         "true"         "true"        "9.1.1"

Is there any explanation for this? Is this related to the versions of GEOS/proj.4/PROJ?

Thanks

elgabbas commented 1 year ago

Here is the result of sessionInfo()

R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /gpfs1/schlecker/global/software/easybuild-broadwell/software/FlexiBLAS/3.2.1-GCC-12.2.0/lib64/libflexiblas.so.3.2

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] xml2_1.3.5          writexl_1.4.2       vroom_1.6.4
 [4] tidyr_1.3.0         terra_1.7-55        stringr_1.5.0
 [7] snow_0.4-4          sf_1.0-14           rvest_1.0.3
[10] lubridate_1.9.3     rlang_1.1.2         rgbif_3.7.8
[13] readxl_1.4.3        readr_2.1.4         raster_3.6-26
[16] sp_2.1-1            purrr_1.0.2         pbapply_1.7-2
[19] glue_1.6.2          future.apply_1.11.0 furrr_0.3.1
[22] future_1.32.0       dplyr_1.1.3         data.table_1.14.8
[25] crayon_1.5.2        IASDT.R_0.0.3       magrittr_2.0.3

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.11        lattice_0.21-8     listenv_0.9.0      class_7.3-22
 [5] digest_0.6.33      utf8_1.2.4         parallelly_1.36.0  R6_2.5.1
 [9] cellranger_1.1.0   plyr_1.8.9         e1071_1.7-13       httr_1.4.7
[13] ggplot2_3.4.4      pillar_1.9.0       lazyeval_0.2.2     whisker_0.4.1
[17] oai_0.4.0          bit_4.0.5          munsell_0.5.0      proxy_0.4-27
[21] compiler_4.2.2     pkgconfig_2.0.3    globals_0.16.2     tidyselect_1.2.0
[25] tibble_3.2.1       codetools_0.2-19   fansi_1.0.5        withr_2.5.2
[29] tzdb_0.4.0         grid_4.2.2         jsonlite_1.8.7     gtable_0.3.4
[33] lifecycle_1.0.4    DBI_1.1.3          units_0.8-4        scales_1.2.1
[37] KernSmooth_2.23-21 cli_3.6.1          stringi_1.7.12     generics_0.1.3
[41] vctrs_0.6.4        tools_4.2.2        bit64_4.0.5        hms_1.1.3
[45] timechange_0.2.0   colorspace_2.1-0   sessioninfo_1.2.2  classInt_0.4-10
elgabbas commented 1 year ago

It seems that this is not a problem specific to sf! I get a similar result using sp packgae

coordinates(Pts) <- ~ Longitude + Latitude
  crs(Pts) <- "epsg:4326"
  Pts %>% 
    spTransform("epsg:3035") %>% 
    coordinates()
# windows

     coords.x1 coords.x2
[1,]   4223327   2973462
[2,]   4223327   2973462
[3,]   4223327   2973462
[4,]   4223327   2973462
[5,]   4103036   2972497
[6,]   4145096   3186094
# linux

     coords.x1 coords.x2
[1,]   2973462   4223327
[2,]   2973462   4223327
[3,]   2973462   4223327
[4,]   2973462   4223327
[5,]   2972497   4103036
[6,]   3186094   4145096
rsbivand commented 1 year ago

sp 2.1-1 uses sf, so not unusual. What standard have you chosen for axis handling? See https://r-spatial.github.io/sf/reference/st_crs.html, specifically:

st_axis_order returns the (logical) current value if called without argument, or (invisibly) the previous value if it is being set.

elgabbas commented 1 year ago

It returns FALSE on both computers

edzer commented 1 year ago

Strange, I get with the same PROJ/GEOS/GDAL versions

> library(sf)
Linking to GEOS 3.11.1, GDAL 3.6.2, PROJ 9.1.1; sf_use_s2() is TRUE
> Pts <- tibble::tibble(
    Longitude = c(8.641549, 8.641549, 8.641549, 8.641549, 6.970313, 7.451851), 
    Latitude = c(49.865556, 49.865556, 49.865556, 49.865556, 49.824565, 51.757528)) %>% 
    sf::st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326, remove = FALSE) %>% 
    sf::st_transform(crs = 3035)
> Pts
Simple feature collection with 6 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4103036 ymin: 2972497 xmax: 4223327 ymax: 3186094
Projected CRS: ETRS89-extended / LAEA Europe
# A tibble: 6 x 3
  Longitude Latitude          geometry
*     <dbl>    <dbl>       <POINT [m]>
1      8.64     49.9 (4223327 2973462)
2      8.64     49.9 (4223327 2973462)
3      8.64     49.9 (4223327 2973462)
4      8.64     49.9 (4223327 2973462)
5      6.97     49.8 (4103036 2972497)
6      7.45     51.8 (4145096 3186094)
> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
[1] C

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sf_1.0-14

loaded via a namespace (and not attached):
 [1] utf8_1.2.4         e1071_1.7-13       magrittr_2.0.3     glue_1.6.2        
 [5] KernSmooth_2.23-20 tibble_3.2.1       pkgconfig_2.0.3    lifecycle_1.0.4   
 [9] classInt_0.4-10    cli_3.6.1          fansi_1.0.5        vctrs_0.6.4       
[13] grid_4.3.0         DBI_1.1.3          proxy_0.4-27       class_7.3-21      
[17] compiler_4.3.0     tools_4.3.0        pillar_1.9.0       Rcpp_1.0.11       
[21] rlang_1.1.2        units_0.8-4       

Have you tried installing a newer PROJ version on the HPC?

elgabbas commented 1 year ago

Thanks a lot for your feedback..

PROJ was installed by the HPC administration.

That is really strange!!! If I run the script above in a separate SLURM job, I get the expected results. However, if I load a raster object (using raster::raster() or terra::rast()) before running this script (in another SLURM job), the coordinates are swapped!!

elgabbas commented 1 year ago

This is a screenshot of the results. image image

Interestingly, when transforming the data back into epsg:4326 the coordinates are correct [which confirms that the st_transform function swaps the coordinates] image


The problem also exists when using terra::project on a SpatVector object

Pts %>% 
  as.matrix() %>% 
  terra::vect(crs = "epsg:4236") %>% 
  terra::project("epsg:3035") %>% 
  terra::crds()

image


This does not happen after reading any *.tif file. I tried loading a variety of tif files before using st_transform and the problem occurred ONLY after loading a specific *.tif file (Copernicus CORINE land cover data; either using raster::raster() or terra::rast()) first, then using st_transform.

If st_transform was used first (after loading the packages), the results are as expected (i.e., there are no weird results after using raster::raster() or terra::rast()). However, if either raster::raster() or terra::rast() is used first with this file, the problem occurr.

Is there a way to find out the reason for this behavior; i.e., is it possible to compare something before and after loading the tif file? [I can share the tif file with you, when necessary]

Thanks

edzer commented 1 year ago

Please share such a .tif file and provide a complete script.

elgabbas commented 1 year ago

Please download the .tif file (along with other files) from this link (https://nc.ufz.de/s/cFbLtCkpSoSkRC5 --- password: St_transform_1)

Here is the complete script:

require(sf)
require(dplyr)
require(raster)
require(terra)

RR <- raster::raster("DATA/U2018_CLC2018_V2020_20u1.tif")

# data
Pts <- tibble::tibble(
  Longitude = c(8.641549, 8.641549, 8.641549, 8.641549, 6.970313, 7.451851), 
  Latitude = c(49.865556, 49.865556, 49.865556, 49.865556, 49.824565, 51.757528))

# using sf
Pts %>% 
  sf::st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326, remove = FALSE) %>% 
  sf::st_transform(crs = 3035)
# Simple feature collection with 6 features and 2 fields
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 2972497 ymin: 4103036 xmax: 3186094 ymax: 4223327
# Projected CRS: ETRS89-extended / LAEA Europe
# # A tibble: 6 × 3
# Longitude Latitude          geometry
# *     <dbl>    <dbl>       <POINT [m]>
#   1      8.64     49.9 (2973462 4223327)
# 2      8.64     49.9 (2973462 4223327)
# 3      8.64     49.9 (2973462 4223327)
# 4      8.64     49.9 (2973462 4223327)
# 5      6.97     49.8 (2972497 4103036)
# 6      7.45     51.8 (3186094 4145096)

Pts %>% 
  sf::st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326, remove = FALSE) %>% 
  sf::st_transform(crs = 3035) %>% 
  sf::st_transform(crs = 4326)

# Simple feature collection with 6 features and 2 fields
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 6.970313 ymin: 49.82456 xmax: 8.641549 ymax: 51.75753
# Geodetic CRS:  WGS 84
# # A tibble: 6 × 3
# Longitude Latitude            geometry
# *     <dbl>    <dbl>         <POINT [°]>
#   1      8.64     49.9 (8.641549 49.86556)
# 2      8.64     49.9 (8.641549 49.86556)
# 3      8.64     49.9 (8.641549 49.86556)
# 4      8.64     49.9 (8.641549 49.86556)
# 5      6.97     49.8 (6.970313 49.82456)
# 6      7.45     51.8 (7.451851 51.75753)

# using terra
Pts %>% 
  as.matrix() %>% 
  terra::vect(crs = "epsg:4236") %>% 
  terra::project("epsg:3035") %>% 
  terra::crds()

# x       y
# [1,] 2973462 4223327
# [2,] 2973462 4223327
# [3,] 2973462 4223327
# [4,] 2973462 4223327
# [5,] 2972497 4103036
# [6,] 3186094 4145096

Thanks

edzer commented 1 year ago

I cannot reproduce your swapped coordinates, also not in the container with identical GEOS/PROJ/GDAL versions. Is the environment variable PROJ_LIB set on your HPC system? What does st_crs(3035) print on your HPC system?

elgabbas commented 1 year ago

Thanks for your feedback... This is really strange.

Sys.getenv("PROJ_LIB")
[1] ""
st_crs(3035)
Coordinate Reference System:
  User input: EPSG:3035
  wkt:
PROJCRS["ETRS89-extended / LAEA Europe",
    BASEGEOGCRS["ETRS89",
        ENSEMBLE["European Terrestrial Reference System 1989 ensemble",
            MEMBER["European Terrestrial Reference Frame 1989"],
            MEMBER["European Terrestrial Reference Frame 1990"],
            MEMBER["European Terrestrial Reference Frame 1991"],
            MEMBER["European Terrestrial Reference Frame 1992"],
            MEMBER["European Terrestrial Reference Frame 1993"],
            MEMBER["European Terrestrial Reference Frame 1994"],
            MEMBER["European Terrestrial Reference Frame 1996"],
            MEMBER["European Terrestrial Reference Frame 1997"],
            MEMBER["European Terrestrial Reference Frame 2000"],
            MEMBER["European Terrestrial Reference Frame 2005"],
            MEMBER["European Terrestrial Reference Frame 2014"],
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]],
            ENSEMBLEACCURACY[0.1]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4258]],
    CONVERSION["Europe Equal Area 2001",
        METHOD["Lambert Azimuthal Equal Area",
            ID["EPSG",9820]],
        PARAMETER["Latitude of natural origin",52,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",10,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["False easting",4321000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",3210000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (Y)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (X)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Statistical analysis."],
        AREA["Europe - European Union (EU) countries and candidates. Europe - onshore and offshore: Albania; Andorra; Austria; Belgium; Bosnia and Herzegovina; Bulgaria; Croatia; Cyprus; Czechia; Denmark; Estonia; Faroe Islands; Finland; France; Germany; Gibraltar; Greece; Hungary; Iceland; Ireland; Italy; Kosovo; Latvia; Liechtenstein; Lithuania; Luxembourg; Malta; Monaco; Montenegro; Netherlands; North Macedonia; Norway including Svalbard and Jan Mayen; Poland; Portugal including Madeira and Azores; Romania; San Marino; Serbia; Slovakia; Slovenia; Spain including Canary Islands; Sweden; Switzerland; Türkiye (Turkey); United Kingdom (UK) including Channel Islands and Isle of Man; Vatican City State."],
        BBOX[24.6,-35.58,84.73,44.83]],
    ID["EPSG",3035]]
elgabbas commented 1 year ago

I get the same wrong results when using stars::read_stars before st_transform. This seems not to be a bug directly related to R packages like sf / raster / terra / etc.


I tried the same script on another version of R (R/4.0.4-2) and PROJ on the same HPC and the results were as expected.

 sf::sf_extSoftVersion()
          GEOS           GDAL         proj.4 GDAL_with_GEOS     USE_PROJ_H   PROJ
       "3.9.1"        "3.2.1"        "7.2.1"         "true"         "true"       "7.2.1"

This is strange. Probably, this is a problem related to the configuration of the newer version of R or PROJ (or something else). I will see if PROJ can be updated on the HPC.

Thanks for your help.