mapme-initiative / mapme.biodiversity

Efficient analysis of spatial biodiversity datasets for global portfolios
https://mapme-initiative.github.io/mapme.biodiversity/dev
GNU General Public License v3.0

get_* fails for some resources with overlapping polygons for worldpop and nelson_et_al #335

Closed fBedecarrats closed 3 months ago

fBedecarrats commented 3 months ago

Sometimes we need to compute indicators for overlapping polygons (e.g. buffers around villages or PAs). When a portfolio contains overlapping polygons, get_nelson_et_al() and get_worldpop() fail. Others I tried do work, however: get_nasa_srtm and get_gfw_*. I think this might be related to https://github.com/mapme-initiative/mapme.biodiversity/issues/319 Here is a reprex:

library(tidygeocoder)
library(sf)
library(tidyverse)
library(mapme.biodiversity)
library(geodata)

# Create a dataframe with the cities 
locations <- data.frame(city = c("Frankfurt am Main", "Darmstadt"),
                        country = "Germany")

# Geocode the cities to get latitude and longitude
locations <- locations %>%
  geocode(city = city, country = country, method = 'osm')

# Convert to sf object
locations_sf <- st_as_sf(locations, coords = c("long", "lat"), crs = 4326)

# Create 20km buffers around each city. The buffers overlap.
buffers <- st_buffer(locations_sf, dist = 20000)

# Store data in "data" folder where we are sure to have writing rights
mapme_options(outdir = "data")

# Attempt to download travel time and population density resources
# These fail, I guess due to overlapping polygons
buffers <- get_resources(buffers, get_nelson_et_al(ranges = "20k_110mio"))

buffers <- get_resources(buffers, get_worldpop(years = 2000))

Throws the following error:

Error in sf::gdal_utils(util = util, source = source, destination = destination,  : 
  gdal_utils translate: an error occured
In addition: Warning message:
In check_namespace("progressr", error = FALSE) :
  R package 'progressr' required.
Please install via `install.packages('progressr')`FALSE
Error in sf::gdal_utils(util = util, source = source, destination = destination,  : 
  gdal_utils translate: an error occured
Error in sf::gdal_utils(util = util, source = source, destination = destination,  : 
  gdal_utils translate: an error occured
Warning messages:
1: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 4: Attempt to create new tiff file `data/nelson_et_al/traveltime-20k_110mio.tif' failed: No such file or directory
2: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 4: Attempt to create new tiff file `data/nelson_et_al/traveltime-20k_110mio.tif' failed: No such file or directory
3: In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 4: Attempt to create new tiff file `data/nelson_et_al/traveltime-20k_110mio.tif' failed: No such file or directory
buffers <- get_resources(buffers, get_worldpop(years = 2000))

Throws the following error:

Found a column named 'assetid'. Overwritting its values with a unique identifier.
trying URL 'https://data.worldpop.org/GIS/Population/Global_2000_2020/2000/0_Mosaicked/ppp_2000_1km_Aggregated.tif'
Content type 'image/tiff' length 1165526326 bytes (1111.5 MB)
==================================================
downloaded 1111.5 MB

Error in map2(.x, .y, .f, ..., .progress = .progress) : 
  ℹ In index: 1.
Caused by error in `sf::gdal_utils()`:
! gdal_utils translate: an error occured
In addition: Warning message:
In CPL_gdaltranslate(source, destination, options, oo, config_options,  :
  GDAL Error 4: Attempt to create new tiff file `data/worldpop/ppp_2000_1km_Aggregated.tif' failed: No such file or directory
Error in .check_footprints(resource, resource_name) : 
  Download for resource worldpop failed.
Returning unmodified portfolio.

# Get GADM polygon for Hessen (includes Frankfurt and Darmstadt)
hessen <- gadm("Germany", level = 1, path = "data") %>%
  st_as_sf() %>%
  filter(NAME_1 == "Hessen")

# Attempt to download travel time and population density resources for Hesse
hessen <- get_resources(hessen, get_nelson_et_al(ranges = "20k_110mio"))
hessen <- get_resources(hessen, get_worldpop(years = 2000))
# It works !

# The resources we got for Hessen work to compute indicators for Frankfurt and Darmstadt
buffers <- calc_indicators(buffers,
                           calc_population_count(),
                           calc_traveltime())
# It works !

goergen95 commented 3 months ago

There are several things going on here:

Recently (#308), the source server of the nelson_et_al resource has been somewhat unreliable, which is why we now recommend setting the respective GDAL options so that the download does not fail:

https://github.com/mapme-initiative/mapme.biodiversity/blob/9672d64fe791194913675aca0cfe3f6cd56e0517/R/get_nelson_et_al.R#L23-L26

For worldpop, we also recommend increasing the timeout option so the data can be downloaded successfully:

https://github.com/mapme-initiative/mapme.biodiversity/blob/9672d64fe791194913675aca0cfe3f6cd56e0517/R/get_worldpop.R#L10-L12
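As a minimal sketch of what setting these options can look like in a session: the option names below are R's `timeout` option and GDAL's HTTP retry configuration options, and the values are the ones used in the example later in this thread. They are illustrative rather than tuned recommendations, and they are not a copy of the linked source lines.

```r
# R's default download timeout is 60 seconds, which can be too short for
# large files such as the ~1.1 GB worldpop raster.
options(timeout = 600)

# Ask GDAL to retry failed HTTP requests instead of aborting on the
# first error (values are illustrative).
Sys.setenv(
  GDAL_HTTP_MAX_RETRY = "5",
  GDAL_HTTP_RETRY_DELAY = "15"
)

# Check that the settings are active in the current session
getOption("timeout")
Sys.getenv(c("GDAL_HTTP_MAX_RETRY", "GDAL_HTTP_RETRY_DELAY"))
```

Both settings only apply to the current R session, so they need to be set before calling get_resources().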

Both of these resources represent global raster files. Their respective source code thus does not interact with the geometries you provide as the portfolio, meaning that it does not matter if the geometries overlap or not.

Given the mentioned error message:

1: In CPL_gdaltranslate(source, destination, options, oo, config_options, : GDAL Error 4: Attempt to create new tiff file `data/nelson_et_al/traveltime-20k_110mio.tif' failed: No such file or directory

I suspect that the directory data did not exist when you ran the first part of your code example. It was only created once you ran

# Get GADM polygon for Hessen (includes Frankfurt and Darmstadt)
hessen <- gadm("Germany", level = 1, path = "data") %>%
  st_as_sf() %>%
  filter(NAME_1 == "Hessen")

with the gadm() call creating the data directory on disk to store its files there. Since that directory now existed, your subsequent get_resources() calls ran successfully.

Why don't we check if outdir exists in mapme.biodiversity?

Because the value of outdir can also point to remote cloud storage, so we cannot run dir.exists() or dir.create() against it and expect meaningful outcomes.
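For purely local workflows, the check can be done on the user side before setting the option. The following is only a sketch: `ensure_local_outdir()` is a hypothetical helper, not part of mapme.biodiversity, and the way it recognizes remote locations (a GDAL virtual file system prefix `/vsi` or a `://` scheme) is an assumption made for illustration.

```r
# Hypothetical helper, not part of mapme.biodiversity
ensure_local_outdir <- function(path) {
  # Assumption: remote locations are written either as GDAL virtual
  # file system paths ("/vsi...") or as URIs containing "://"
  is_remote <- grepl("^/vsi", path) || grepl("://", path, fixed = TRUE)
  # Only touch the file system for local paths that do not exist yet
  if (!is_remote && !dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  invisible(path)
}

# Create a local output directory if needed, then confirm it exists
outdir <- ensure_local_outdir(file.path(tempdir(), "data"))
dir.exists(outdir)

# The path can then be passed on, e.g.:
# mapme.biodiversity::mapme_options(outdir = outdir)
```

Remote paths are returned unchanged, so the same call can stay in a script that is later pointed at cloud storage.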

Given these considerations, the below code example (sorry to say it) "works on my machine". Could you please run it and post your output here?

reprex::reprex({
  library(sf)
  library(mapme.biodiversity)

  locations <- data.frame(
    long = c(8.682127,  8.646927),
    lat = c(50.110924, 49.878708),
    city = c("Frankfurt", "Darmstadt")
  )

  locations_sf <- st_as_sf(locations, coords = c("long", "lat"), crs = st_crs(4326))
  buffers <- st_buffer(locations_sf, dist = 20000)
  st_overlaps(buffers)

  # create output directory
  outdir <- tempfile()
  dir.create(outdir)
  dir.exists(outdir)

  # set timeout and retry options as per documentation
  options(timeout = 600)
  Sys.setenv(GDAL_HTTP_MAX_RETRY = "5",
             GDAL_HTTP_RETRY_DELAY = "15")

  mapme_options(outdir = outdir)
  get_resources(buffers,
                get_nelson_et_al(ranges = "20k_110mio"),
                get_worldpop(years = 2000))

  list.files(outdir, recursive = TRUE)

}, session_info = TRUE)

reprex output

``` r
library(sf)
#> Linking to GEOS 3.12.2, GDAL 3.9.1, PROJ 9.4.1; sf_use_s2() is TRUE
library(mapme.biodiversity)

locations <- data.frame(
  long = c(8.682127, 8.646927),
  lat = c(50.110924, 49.878708),
  city = c("Frankfurt", "Darmstadt")
)

locations_sf <- st_as_sf(locations, coords = c("long", "lat"), crs = st_crs(4326))
buffers <- st_buffer(locations_sf, dist = 20000)
st_overlaps(buffers)
#> although coordinates are longitude/latitude, st_overlaps assumes that they are
#> planar
#> Sparse geometry binary predicate list of length 2, where the predicate
#> was `overlaps'
#>  1: 2
#>  2: 1

# create output directory
outdir <- tempfile()
dir.create(outdir)
dir.exists(outdir)
#> [1] TRUE

# set timeout and retry options as per documentation
options(timeout = 600)
Sys.setenv(GDAL_HTTP_MAX_RETRY = "5",
           GDAL_HTTP_RETRY_DELAY = "15")

mapme_options(outdir = outdir)
get_resources(buffers,
              get_nelson_et_al(ranges = "20k_110mio"),
              get_worldpop(years = 2000))
#> Warning in CPL_gdaltranslate(source, destination, options, oo, config_options,
#> : GDAL Message 1: HTTP error code: 429 -
#> https://figshare.com/ndownloader/files/14189843. Retrying again in 15.0 secs
#> Warning in CPL_gdaltranslate(source, destination, options, oo, config_options,
#> : GDAL Message 1: HTTP error code: 429 -
#> https://figshare.com/ndownloader/files/14189843. Retrying again in 33.0 secs
#> Warning in CPL_gdaltranslate(source, destination, options, oo, config_options,
#> : GDAL Message 1: HTTP error code: 429 -
#> https://figshare.com/ndownloader/files/14189843. Retrying again in 81.8 secs

list.files(outdir, recursive = TRUE)
#> [1] "nelson_et_al/traveltime-20k_110mio.tif"
#> [2] "worldpop/ppp_2000_1km_Aggregated.tif"
```

Created on 2024-08-17 with [reprex v2.1.0](https://reprex.tidyverse.org)

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Ubuntu 22.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Etc/UTC
#>  date     2024-08-17
#>  pandoc   3.2 @ /usr/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package            * version    date (UTC) lib source
#>  class                7.3-22     2023-05-03 [2] CRAN (R 4.4.1)
#>  classInt             0.4-10     2023-09-05 [1] CRAN (R 4.4.1)
#>  cli                  3.6.3      2024-06-21 [1] RSPM (R 4.4.0)
#>  codetools            0.2-20     2024-03-31 [2] CRAN (R 4.4.1)
#>  curl                 5.2.1      2024-03-01 [1] RSPM (R 4.4.0)
#>  DBI                  1.2.3      2024-06-02 [1] RSPM (R 4.4.0)
#>  digest               0.6.36     2024-06-23 [1] RSPM (R 4.4.0)
#>  dplyr                1.1.4      2023-11-17 [1] RSPM (R 4.4.0)
#>  e1071                1.7-14     2023-12-06 [1] CRAN (R 4.4.1)
#>  evaluate             0.24.0     2024-06-10 [1] RSPM (R 4.4.0)
#>  fansi                1.0.6      2023-12-08 [1] RSPM (R 4.4.0)
#>  fastmap              1.2.0      2024-05-15 [1] RSPM (R 4.4.0)
#>  fs                   1.6.4      2024-04-25 [1] RSPM (R 4.4.0)
#>  furrr                0.3.1      2022-08-15 [1] RSPM (R 4.4.0)
#>  future               1.34.0     2024-07-29 [1] CRAN (R 4.4.1)
#>  generics             0.1.3      2022-07-05 [1] RSPM (R 4.4.0)
#>  globals              0.16.3     2024-03-08 [1] CRAN (R 4.4.1)
#>  glue                 1.7.0      2024-01-09 [1] RSPM (R 4.4.0)
#>  htmltools            0.5.8.1    2024-04-04 [1] RSPM (R 4.4.0)
#>  httr2                1.0.2      2024-07-16 [1] RSPM (R 4.4.0)
#>  jsonlite             1.8.8      2023-12-04 [1] RSPM (R 4.4.0)
#>  KernSmooth           2.23-24    2024-05-17 [2] CRAN (R 4.4.1)
#>  knitr                1.48       2024-07-07 [1] RSPM (R 4.4.0)
#>  lifecycle            1.0.4      2023-11-07 [1] RSPM (R 4.4.0)
#>  listenv              0.9.1      2024-01-29 [1] CRAN (R 4.4.1)
#>  magrittr             2.0.3      2022-03-30 [1] RSPM (R 4.4.0)
#>  mapme.biodiversity * 0.8.0.9013 2024-08-16 [1] Github (mapme-initiative/mapme.biodiversity@7e396a5)
#>  parallelly           1.38.0     2024-07-27 [1] CRAN (R 4.4.1)
#>  pillar               1.9.0      2023-03-22 [1] RSPM (R 4.4.0)
#>  pkgconfig            2.0.3      2019-09-22 [1] RSPM (R 4.4.0)
#>  progressr            0.14.0     2023-08-10 [1] RSPM (R 4.4.0)
#>  proxy                0.4-27     2022-06-09 [1] CRAN (R 4.4.1)
#>  purrr              * 1.0.2      2023-08-10 [1] RSPM (R 4.4.0)
#>  R6                   2.5.1      2021-08-19 [1] RSPM (R 4.4.0)
#>  rappdirs             0.3.3      2021-01-31 [1] RSPM (R 4.4.0)
#>  Rcpp                 1.0.13     2024-07-17 [1] RSPM (R 4.4.0)
#>  reprex               2.1.0      2024-01-11 [1] RSPM (R 4.4.0)
#>  rlang                1.1.4      2024-06-04 [1] RSPM (R 4.4.0)
#>  rmarkdown            2.27       2024-05-17 [1] RSPM (R 4.4.0)
#>  rstudioapi           0.16.0     2024-03-24 [1] RSPM (R 4.4.0)
#>  s2                   1.1.7      2024-07-17 [1] CRAN (R 4.4.1)
#>  sessioninfo          1.2.2      2021-12-06 [1] RSPM (R 4.4.0)
#>  sf                 * 1.0-16     2024-03-24 [1] CRAN (R 4.4.1)
#>  terra                1.7-78     2024-05-22 [1] CRAN (R 4.4.1)
#>  tibble               3.2.1      2023-03-20 [1] RSPM (R 4.4.0)
#>  tidyselect           1.2.1      2024-03-11 [1] RSPM (R 4.4.0)
#>  units                0.8-5      2023-11-28 [1] CRAN (R 4.4.1)
#>  utf8                 1.2.4      2023-10-22 [1] RSPM (R 4.4.0)
#>  vctrs                0.6.5      2023-12-01 [1] RSPM (R 4.4.0)
#>  withr                3.0.1      2024-07-31 [1] RSPM (R 4.4.0)
#>  wk                   0.9.2      2024-07-09 [1] CRAN (R 4.4.1)
#>  xfun                 0.46       2024-07-18 [1] RSPM (R 4.4.0)
#>  yaml                 2.3.10     2024-07-26 [1] RSPM (R 4.4.0)
#>
#>  [1] /usr/local/lib/R/site-library
#>  [2] /usr/local/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```