r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/

Version 1.0.4 - geometry objects are not valid #1821

Closed RamiKrispin closed 2 years ago

RamiKrispin commented 2 years ago

Describe the bug

After moving from sf version 0.9.5 to version 1.0.4, the output for the same sf object changed: some of the geometry objects are now classified as invalid.

To Reproduce

# With version 0.9.5
remotes::install_version(package = "sf", version = "0.9.5")
library(rnaturalearth)
library(sf)
library(tmap)
library(dplyr)
library(coronavirus)

data("covid19_vaccine")

map <- ne_countries(returnclass = "sf") %>%
  dplyr::select(name, iso2 = iso_a2, iso3 = iso_a3, geometry)

head(map)
df <- map %>% left_join(
  covid19_vaccine %>%
    filter(date == max(date),
           is.na(province_state)) %>%
    mutate(perc = round(100 * people_fully_vaccinated / population, 2)) %>%
    select(country_region, iso2, iso3, people_fully_vaccinated, perc, continent_name),
  by = c("iso2", "iso3")
) %>% 
  filter(!name %in% c("Greenland", "Antarctica"))

Checking whether all the geometry objects are valid:

table(sf::st_is_valid(df))
TRUE 
 175 

Repeat the process with version 1.0.4

install.packages("sf")

library(rnaturalearth)
library(sf)
library(tmap)
library(dplyr)
library(coronavirus)

data("covid19_vaccine")

map <- ne_countries(returnclass = "sf") %>%
  dplyr::select(name, iso2 = iso_a2, iso3 = iso_a3, geometry)

head(map)
df <- map %>% left_join(
  covid19_vaccine %>%
    filter(date == max(date),
           is.na(province_state)) %>%
    mutate(perc = round(100 * people_fully_vaccinated / population, 2)) %>%
    select(country_region, iso2, iso3, people_fully_vaccinated, perc, continent_name),
  by = c("iso2", "iso3")
) %>% 
  filter(!name %in% c("Greenland", "Antarctica"))

With version 1.0.4, the geometry objects of Russia and Fiji are invalid:

table(sf::st_is_valid(df))

FALSE  TRUE 
    2   173 

Additional context

This issue causes the following error:

library(tmap)
> tm_shape(df) + 
      tm_polygons(col = "perc",  
                  n = 8,
                  title = "Fully Vaccinated %",
                  palette = "Blues")
Error in st_transform_proj.sfc(st_geometry(x), crs, ...) : 
  is.character(crs) is not TRUE
In addition: Warning message:
The shape df is invalid. See sf::st_is_valid 

Paste the output of your `sessionInfo()` and `sf::sf_extSoftVersion()`

``` r
> sf::sf_extSoftVersion()
          GEOS           GDAL         proj.4 GDAL_with_GEOS     USE_PROJ_H           PROJ 
       "3.7.2"        "2.4.2"        "5.2.0"        "false"        "false"        "5.2.0" 
> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] coronavirus_0.3.31  dplyr_1.0.7         tmap_2.3-2          sf_1.0-3           
[5] rnaturalearth_0.1.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7         pillar_1.6.2       compiler_3.6.0     RColorBrewer_1.1-2
 [5] class_7.3-15       remotes_2.2.0      tools_3.6.0        digest_0.6.28     
 [9] viridisLite_0.3.0  lifecycle_1.0.1    tibble_3.1.4       lattice_0.20-38   
[13] pkgconfig_2.0.3    rlang_0.4.11       DBI_1.1.0          rstudioapi_0.11   
[17] crosstalk_1.1.0.1  parallel_3.6.0     yaml_2.2.1         e1071_1.7-3       
[21] s2_1.0.7           raster_2.9-5       rgeos_0.5-3        generics_0.0.2    
[25] vctrs_0.3.8        htmlwidgets_1.5.3  classInt_0.4-3     leaflet_2.0.2     
[29] grid_3.6.0         tidyselect_1.1.1   glue_1.4.2         R6_2.4.1          
[33] fansi_0.4.1        XML_3.98-1.20      sp_1.4-1           purrr_0.3.3       
[37] magrittr_1.5       codetools_0.2-16   stars_0.5-3        tmaptools_3.1-1   
[41] ellipsis_0.3.2     htmltools_0.5.1.1  leafsync_0.1.0     units_0.6-6       
[45] abind_1.4-5        dichromat_2.0-0    assertthat_0.2.1   utf8_1.1.4        
[49] KernSmooth_2.23-15 wk_0.5.0           lwgeom_0.2-1       crayon_1.3.4      
```

rsbivand commented 2 years ago
> st_is_longlat(df)
[1] TRUE
> sf_use_s2()
[1] TRUE
> table(sf::st_is_valid(df))

FALSE  TRUE 
    3   179 
> sf_use_s2(FALSE)
Spherical geometry (s2) switched off
> table(sf::st_is_valid(df))

TRUE 
 182 

Please do read https://cran.r-project.org/web/packages/sf/news/news.html, noting this entry for 1.0-0:

use s2 spherical geometry as default when coordinates are ellipsoidal
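
The effect of that default can be checked side by side. The following is a minimal sketch, not part of sf: `compare_validity()` is a hypothetical helper that runs `st_is_valid()` once with s2 switched on and once with it switched off, then restores whatever setting was active before. The unit square is a toy input assumed for illustration; with the data from this issue one would pass `df` instead.

```r
library(sf)

# Hypothetical helper (not part of sf): validity under both geometry engines.
compare_validity <- function(x) {
  old <- sf_use_s2()                                     # remember current setting
  on.exit(suppressMessages(sf_use_s2(old)), add = TRUE)  # restore it on exit

  suppressMessages(sf_use_s2(TRUE))                      # spherical geometry (s2)
  spherical <- st_is_valid(x)

  suppressMessages(sf_use_s2(FALSE))                     # planar geometry (GEOS)
  planar <- st_is_valid(x)

  data.frame(spherical = spherical, planar = planar)
}

# Toy input (an assumption for illustration): a unit square in longitude/latitude.
square <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
  crs = 4326
)
res <- compare_validity(square)
```

Rows where `spherical` is `FALSE` but `planar` is `TRUE` would be the features affected by the change of default, for example `subset(res, !spherical & planar)`.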

jlacko commented 2 years ago

This is more of a problem with the {rnaturalearth} package, which is where the flawed dataset is coming from, than either {sf} or {s2}, which behave in line with their documentation.

Note how the issue seems to involve only countries that straddle the antimeridian; this will give you some idea about its root cause.
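
To see which features fail and why, `st_is_valid()` accepts `reason = TRUE` and then returns a text description instead of a logical. The sketch below wraps that in a hypothetical helper, `invalid_report()`, which is not part of sf. The toy geometries (a square and a self-intersecting "bowtie") carry no CRS and are assumptions for illustration only; with the data from this issue the call might look like `invalid_report(map, label = map$name)`.

```r
library(sf)

# Hypothetical helper (not part of sf): list invalid features with the stated reason.
invalid_report <- function(x, label = NULL) {
  geom <- st_geometry(x)
  ok   <- st_is_valid(geom)
  bad  <- which(!ok | is.na(ok))          # treat NA (check failed) as invalid

  # Only ask for reasons when there is something to explain.
  reason <- if (length(bad) > 0) {
    st_is_valid(geom[bad], reason = TRUE)
  } else {
    character(0)
  }

  data.frame(
    row    = bad,
    label  = if (is.null(label)) rep(NA_character_, length(bad)) else as.character(label[bad]),
    reason = reason,
    stringsAsFactors = FALSE
  )
}

# Toy input (assumption): one valid square and one self-intersecting ring.
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
bowtie <- st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))))
shapes <- st_sfc(square, bowtie)

report <- invalid_report(shapes, label = c("square", "bowtie"))
```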

As a workaround I suggest using a different version of the world dataset; there are several readily available to choose from. My favorite lives in the {giscoR} package, which interfaces to shapefiles provided by the good folks at Eurostat.

library(rnaturalearth)
library(sf)
library(dplyr)
library(giscoR)

map <- ne_countries(returnclass = "sf") %>%
    dplyr::select(name, iso2 = iso_a2, iso3 = iso_a3, geometry)

map$name[!st_is_valid(map)]
[1] "Antarctica" "Fiji"       "Russia"    

map2 <- giscoR::gisco_get_countries() %>% 
    dplyr::select(name = NAME_ENGL, iso2 = CNTR_ID, iso3 = ISO3_CODE)

map2$name[!st_is_valid(map2)]
character(0) # i.e. nothing is invalid = all countries are valid
edzer commented 2 years ago

Eurostat cheated:

> st_bbox(map2[map2$name == "Fiji",])
      xmin       ymin       xmax       ymax 
-180.00000  -20.70876  179.99999  -12.46203 
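
A bounding box running from -180 to just under 180 is what one would expect if the geometry has pieces on both sides of the antimeridian, so that its longitude extent covers the whole globe. As a rough sketch of how to spot such features in any dataset, the hypothetical helper below (`spans_full_longitude()` is not part of sf, and the tolerance `tol` is an arbitrary assumption) flags geometries whose bounding box is nearly 360 degrees wide. The two toy geometries are made up for illustration.

```r
library(sf)

# Hypothetical helper (not part of sf): flag features whose bounding box
# covers (almost) the full longitude range, within `tol` degrees.
spans_full_longitude <- function(x, tol = 1) {
  vapply(st_geometry(x), function(g) {
    bb <- st_bbox(g)
    (bb[["xmax"]] - bb[["xmin"]]) > 360 - tol
  }, logical(1))
}

# Toy input (assumption): a multipolygon with one piece on each side of the
# antimeridian, and an ordinary unit square.
ring_west <- rbind(c(178, -18), c(180, -18), c(180, -16), c(178, -16), c(178, -18))
ring_east <- rbind(c(-180, -18), c(-178, -18), c(-178, -16), c(-180, -16), c(-180, -18))
fiji_like <- st_multipolygon(list(list(ring_west), list(ring_east)))
square    <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))

shapes <- st_sfc(fiji_like, square, crs = 4326)
flags  <- spans_full_longitude(shapes)
# flags is c(TRUE, FALSE): only the split multipolygon spans the full range
```
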
RamiKrispin commented 2 years ago

Thank you all for the solution and explanation!