mapme-initiative / mapme.biodiversity

Efficient analysis of spatial biodiversity datasets for global portfolios
https://mapme-initiative.github.io/mapme.biodiversity/
GNU General Public License v3.0
24 stars 7 forks source link

Can't calculate ESA landcover for specific polygon #208

Closed karpfen closed 7 months ago

karpfen commented 7 months ago

I've encountered a random polygon that could not be processed for the esalandcover indicator. The bug seems similar to the one described in #196, however in this case it should be possible to calculate the indicator, because the necessary data are definitely available and overlapping with the polygon. Any ideas what could be wrong in the example below? It's not the small area of the polygon, as I tried it successfully with even smaller ones that lie within the original polygon. So I suspect there is something specific about this particular geometry, but I can't figure out what it is.

library(sf)
library(mapme.biodiversity)

wkt <- "POLYGON ((-74.5208 10.9139, -74.5207 10.9139, -74.5207 10.9138, -74.5206 10.9138, -74.5206 10.9137, -74.5205 10.9137, -74.5205 10.9136, -74.5206 10.9136, -74.5207 10.9136, -74.5208 10.9137, -74.5209 10.9137, -74.521 10.9137, -74.5211 10.9137, -74.5212 10.9137, -74.5213 10.9137, -74.5213 10.9138, -74.5212 10.9138, -74.5211 10.9138, -74.521 10.9139, -74.5209 10.9139, -74.5208 10.9139))"
x <- data.frame(
  name = "test",
  geom = wkt
)

calculate_landcover <- function(poly) {
  poly_lc <- poly %>%
    init_portfolio(
      2019,
      verbose = TRUE) %>%
    get_resources("esalandcover")
  lc <- calc_indicators(poly_lc, "landcover")
  return(lc$landcover)
}

poly_original <- st_as_sf(x, wkt = "geom", crs = 4326)
st_is_valid(poly_original) # TRUE
poly_buffer <- st_buffer(poly_original , 1)

calculate_landcover(poly_buffer) # this works
calculate_landcover(poly_original) # This doesn't work

# Error in `[<-`:
# ! Assigned data `results[indicator]` must be compatible with existing data.
# ✖ Existing data has 1 row.
# ✖ Assigned data has 0 rows.
# ℹ Row updates require a list value. Do you need `list()` or `as.list()`?
# Caused by error in `vectbl_recycle_rhs_rows()`:
# ! Can't recycle input of size 0 to size 1.
# Run `rlang::last_trace()` to see where the error occurred.
# Warning message:
# In .read_raster_source(x, tindex) : Error : [crop] invalid extent
Output of rlang::last_trace() Error in `[<-`: ! Assigned data `results[indicator]` must be compatible with existing data. ✖ Existing data has 1 row. ✖ Assigned data has 0 rows. ℹ Row updates require a list value. Do you need `list()` or `as.list()`? Caused by error in `vectbl_recycle_rhs_rows()`: ! Can't recycle input of size 0 to size 1. --- Backtrace: ▆ 1. └─global calculate_landcover(poly_original) 2. └─mapme.biodiversity::calc_indicators(poly_lc, "landcover") at worldwide_processing/portfolio_esa_landcover/reprex_error.R:17:2 3. └─mapme.biodiversity:::.get_single_indicator(x, indicator, ...) 4. ├─base::`[<-`(`*tmp*`, indicator, value = ``) 5. └─tibble:::`[<-.tbl_df`(`*tmp*`, indicator, value = ``) 6. └─tibble:::tbl_subassign(x, i, j, value, i_arg, j_arg, substitute(value)) 7. └─tibble:::vectbl_recycle_rhs_rows(value, fast_nrow(xo), i_arg = NULL, value_arg, call)
Output of sessionInfo() R version 4.3.1 (2023-06-16 ucrt) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 11 x64 (build 22621) Matrix products: default Random number generation: RNG: L'Ecuyer-CMRG Normal: Inversion Sample: Rejection locale: [1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8 LC_NUMERIC=C [5] LC_TIME=English_United States.utf8 time zone: Europe/Berlin tzcode source: internal attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] purrr_1.0.2 mapme.biodiversity_0.4.0.9000 sf_1.0-14 RPostgres_1.4.5 loaded via a namespace (and not attached): [1] s2_1.1.4 utf8_1.2.4 future_1.33.0 generics_0.1.3 tidyr_1.3.0 class_7.3-22 KernSmooth_2.23-22 stringi_1.7.12 [9] listenv_0.9.0 hms_1.1.3 digest_0.6.33 magrittr_2.0.3 grid_4.3.1 blob_1.2.4 e1071_1.7-13 DBI_1.1.3 [17] fansi_1.0.5 codetools_0.2-19 cli_3.6.1 rlang_1.1.1 units_0.8-4 parallelly_1.36.0 bit64_4.0.5 withr_2.5.2 [25] tools_4.3.1 parallel_4.3.1 dplyr_1.1.3 globals_0.16.2 curl_5.1.0 vctrs_0.6.3 R6_2.5.1 proxy_0.4-27 [33] lifecycle_1.0.4 classInt_0.4-10 stringr_1.5.0 bit_4.0.5 furrr_0.3.1 pkgconfig_2.0.3 progressr_0.14.0 terra_1.7-55 [41] pillar_1.9.0 glue_1.6.2 Rcpp_1.0.11 tibble_3.2.1 tidyselect_1.2.0 rstudioapi_0.15.0 wk_0.9.0 compiler_4.3.1
karpfen commented 7 months ago

Update: a simple st_make_valid fixes this particular problem (despite the polygon being valid in the first place). I'm leaving the issue open for now, in case this strange error is indicative of some actual bug.

goergen95 commented 7 months ago

Since merging #199 this is now properly caught and a warning about 0-length tibbles is thrown. ~The reason seems to be that the original polygon does not cover the centroid of the raster pixel below it. The buffered version seems to cover it, thus the area for the cell is returned.~ However, the polygon is actually only 1/10th of the size of the cell size, thus the returned area is larger than the polygon area. We would have to think about how we would like to handle such cases. The issue, however, is with the implementation of the .calc_landcover() function.

EDIT: Actually, when applying crop, the cell right below the original polygon is returned, for whatever reason. Then, when applying mask, the original polygon does not touch that cell while the buffered one does. See below:

Screenshot from 2023-11-29 14-15-08

Issue raised in rspatial/terra#1357

goergen95 commented 7 months ago

Setting snap="out" crops the raster to the four closest cells in this case. Now we only need to figure out how we would like to handle the area estimation properly in this case.