mapme-initiative / mapme.biodiversity

Efficient analysis of spatial biodiversity datasets for global portfolios
https://mapme-initiative.github.io/mapme.biodiversity/dev
GNU General Public License v3.0
33 stars 7 forks source link

`Error : [crop] extents do not overlap` during the execution of the `calc_indicators()` function for forest cover area and forest loss #117

Closed Shirobakaidou closed 1 year ago

Shirobakaidou commented 1 year ago

I wanted to calculate forest loss for Madagascar. When the calc_indicators() function was executed, an error message Error : [crop] extents do not overlap kept showing up for most of the input polygons, though the execution wasn't stopped by this error message.

My script is as follows:

library(wdpar) library(geodata) library(sf) library(ggplot2) library(mapme.biodiversity) library(tidyverse)

Download gadm of MDG

gadm <- gadm(country = "Madagascar", resolution = 1, level = 0, path = paste0(getwd(),'/data')) %>% st_as_sf()

Download wdpa

wdpa <- wdpa_fetch("Madagascar", wait = TRUE, download_dir = paste0(getwd(), '/data'))

Make Bounding Box

bbox <- st_as_sf(st_as_sfc(st_bbox(wdpa)))

Make Hexagon Grid

for (i in c(5)) { cellarea <- i * (1e+6) cellsize <- 2 * sqrt(cellarea / ((3 * sqrt(3) / 2))) * sqrt(3) / 2 hexa <- st_make_grid(x = bbox, cellsize = cellsize, square = FALSE) st_write(hexa, paste0(getwd(), "/data/grids/hexa_", i, "km2.gpkg"), append= FALSE) }

Load Hexagons

hexa <- st_read(paste0(getwd(), "/data/grids/hexa_5km2.gpkg"))

Crop hexagons to Country

gadm <- st_transform(gadm, crs = st_crs(hexa)) intersection <- unlist(st_intersects(gadm, hexa)) hexa <- hexa[sort(intersection),]

Export cropped Hexagon Grids

write_sf(hexa, paste0(getwd(), "/data/grids/hexa_5km2_sub.gpkg"), append = FALSE)

Load cropped Hexagons

aoi <- st_read(paste0(getwd(), "/data/grids/hexa_5km2_sub.gpkg")) aoi <- st_transform(aoi, "EPSG:4326")

ncores = 12 sample = FALSE if(sample){ set.seed(123) index = sample(1:nrow(aoi), 1000, replace = FALSE) aoi = aoi[index, ] }

aoi = init_portfolio(aoi, years = 2000:2020, outdir = "./data", tmpdir = "./tmp", cores = ncores, verbose = TRUE)

Download Raster Data: Hansen Forest Change

get_resources(x = aoi, resources = c("gfw_treecover", "gfw_lossyear"))

Calculate Indicator

grid_treeloss <- calc_indicators(x = aoi, indicators = "treeloss", min_cover = 10, min_size = 1)

However, on a plot where the input polygon grid is laid over the footprint of the raster images, their extents overlap.

footprint_raster <- st_read(paste0(getwd(), "/data/gfw_treecover/tileindex_gfw_treecover.gpkg"))

ggplot() + geom_sf(data = st_geometry(footprint_raster)) + geom_sf(data = st_geometry(aoi))

The output, on the other side, looks interpretable in spite of the error message.

goergen95 commented 1 year ago

Hi, similiar to my comment over at #115 please try to make your examples as minimal as possible. I was not able to run your code. However, I assume that some of the polygons within the aoi object do not intersect with the GFW resource. For those this error message is returned and their value will be set to NA. If you provide a more minimalistic code sample (e.g. only include one or two polygons) we could try to confirm this assumption.

Shirobakaidou commented 1 year ago

Hello:) I could not determine which polygons would generate this error message, so it's hard to provide a dataset that only includes one or two polygons. By plotting the extent of my vector grids overlaid on the footprints of the raster images, the polygons seem to lay well within the raster extent.

footprint_Hansen_MDG

The aoi in this issue is generated in the same way as the one in issue #116; now that issue #116 could be reproduced, I assume the aoi can be regenerated by the script. So my guess would be that the function calc_indicators() for tree cover and tree loss didn't run properly.

So far, I ran the calc_indicators() function about 6 times for tree indicators, twice worked but was accompanied by the error: [crop] extents do not overlap, and the other executions got stuck or crashed.

I also attached the file to be read as aoi here for download. It would be great if you could run the calculation for tree indicators once because it's getting hard for me to tell what kind of problem occurred.

goergen95 commented 1 year ago

Running the code given below I see an unknown CRS. Something wrong with the GeoPackage?

library(sf)
#> Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
st_read("hexa_5km2_sub.gpkg")
#> Reading layer `hexa_5km2_sub' from data source 
#>   `/home/darius/Desktop/mapme/mapme.biodiversity/hexa_5km2_sub.gpkg' 
#>   using driver `GPKG'
#> Simple feature collection with 120076 features and 0 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 4165767 ymin: -3162891 xmax: 4873395 ymax: -1537018
#> CRS:           unknown

Created on 2022-12-16 with reprex v2.0.2

Shirobakaidou commented 1 year ago

The CRS was assigned in the next step aoi <- st_transform(aoi, "EPSG:4326")

goergen95 commented 1 year ago

The GeoPackage you provided via GoogleDrive has an unvalid CRS . These are definitely not geographic coordinates. Please check if the code you are running does not contain any errors related to the CRS. Using your GeoPackage and fixing the CRS runs cleanly on my machine for a subset of the grid:

library(geodata)
#> Loading required package: terra
#> terra 1.6.47
library(sf)
#> Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
remotes::install_github("mapme-initiative/mapme.biodiversity")
#> Skipping install of 'mapme.biodiversity' from a github remote, the SHA1 (a897bef4) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(mapme.biodiversity)
library(tidyverse)

data_dir <- "./issues"
dir.create(data_dir, showWarnings = FALSE)
dir.create(file.path(data_dir, "data"), recursive = T, showWarnings = FALSE)
dir.create(file.path(data_dir, "tmp"), recursive = T, showWarnings = FALSE)

gadm <- gadm(country = "Madagascar",
             resolution = 1,
             level = 0,
             path = file.path(data_dir, "data")) %>% st_as_sf()

aoi <- st_read("hexa_5km2_sub.gpkg")
#> Reading layer `hexa_5km2_sub' from data source 
#>   `/home/darius/Desktop/mapme/hexa_5km2_sub.gpkg' using driver `GPKG'
#> Simple feature collection with 120076 features and 0 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 4165767 ymin: -3162891 xmax: 4873395 ymax: -1537018
#> CRS:           unknown
st_crs(aoi) <- st_crs(6933) # fix CRS
aoi <- st_transform(aoi, "EPSG:4326")
aoi$id <- 1:nrow(aoi)

# restrict analysis to polygons off madagscar mainland
mdg <- st_cast(gadm, "POLYGON")[1,]
#> Warning in st_cast.sf(gadm, "POLYGON"): repeating attributes for all
#> sub-geometries for which they may not be constant
intersection <- st_filter(aoi, mdg)
aoi <- aoi[!aoi$id %in% intersection$id, ]

ncores = 6
aoi = init_portfolio(aoi,
                     years = 2000:2020,
                     outdir = file.path(data_dir, "data"),
                     tmpdir = file.path(data_dir, "tmp"),
                     cores = ncores,
                     verbose = TRUE)

aoi <- get_resources(x = aoi,
                     resources = c("gfw_treecover", "gfw_lossyear"))
#> Argument 'vers_treecover' for resource 'gfw_treecover' was not specified. Setting to default value of 'GFC-2020-v1.8'.
#> Starting process to download resource 'gfw_treecover'........
#> Argument 'vers_lossyear' for resource 'gfw_lossyear' was not specified. Setting to default value of 'GFC-2020-v1.8'.
#> Starting process to download resource 'gfw_lossyear'........

grid_treeloss <- calc_indicators(x = aoi,
                                 indicators = "treecover_area",
                                 min_cover = 10,
                                 min_size = 1)

grid_treeloss |>
  unnest(treecover_area) |>
  pull(treecover) |>
  unique() |>
  sort(decreasing = TRUE) |>
  head()
#> [1] 512.1036 512.0287 511.8734 511.8677 511.8623 511.8619

Created on 2022-12-16 with reprex v2.0.2

Shirobakaidou commented 1 year ago

I could reproduce the processing for a small set of polygons

library(geodata)
#> Warning: package 'geodata' was built under R version 4.1.3
#> Loading required package: terra
#> Warning: package 'terra' was built under R version 4.1.3
#> terra 1.6.47
library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1; sf_use_s2() is TRUE
library(mapme.biodiversity)
library(ggplot2)

gadm <- gadm(country = "Madagascar", resolution = 1, level = 0, path = paste0(getwd(),'/data')) %>%
  st_as_sf()

aoi <- st_read(paste0(getwd(), '/data/hexagon/hexagon_5_sqkm_subset.gpkg'))
#> Reading layer `hexagon_5_sqkm_subset' from data source 
#>   `C:\EAGLE\kfw_internship\git_myBranch\MDG\mapme.pa-impact-evaluation\data\hexagon\hexagon_5_sqkm_subset.gpkg' 
#>   using driver `GPKG'
#> Simple feature collection with 120495 features and 0 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 446696.7 ymin: 753897.5 xmax: 1217999 ymax: 2271564
#> CRS:           unknown

#st_crs(aoi) <- st_crs(6933) # fix CRS
aoi <- st_transform(aoi, "EPSG:4326")
aoi$id <- 1:nrow(aoi)

# restrict analysis to polygons in madagscar mainland
mdg <- st_cast(gadm, "POLYGON")[1,]
#> Warning in st_cast.sf(gadm, "POLYGON"): repeating attributes for all
#> sub-geometries for which they may not be constant
intersection <- st_filter(aoi, mdg)
aoi <- aoi[!aoi$id %in% intersection$id, ]

footprint_raster <- st_read(paste0(getwd(), "/data/gfw_treecover/tileindex_gfw_treecover.gpkg"))
#> Reading layer `tileindex_gfw_treecover' from data source 
#>   `C:\EAGLE\kfw_internship\git_myBranch\MDG\mapme.pa-impact-evaluation\data\gfw_treecover\tileindex_gfw_treecover.gpkg' 
#>   using driver `GPKG'
#> Simple feature collection with 3 features and 1 field
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 40 ymin: -30 xmax: 60 ymax: -10
#> Geodetic CRS:  WGS 84

ggplot() + geom_sf(data = st_geometry(footprint_raster)) + geom_sf(data = st_geometry(aoi))

#ncores = 6
aoi = init_portfolio(aoi, years = 2000:2020,
                     outdir = "./data",
                     tmpdir = "./tmp",
                     #cores = ncores,
                     verbose = TRUE)
#> Warning in init_portfolio(aoi, years = 2000:2020, outdir = "./data", tmpdir =
#> "./tmp", : Parallel processing on Windows currently is not supported. Setting
#> number of cores to 1

#aoi <- get_resources(x = aoi,
#                     resources = c("gfw_treecover", "gfw_lossyear"))

grid_treeloss <- calc_indicators(x = aoi,
                                 indicators = "treecover_area",
                                 min_cover = 10,
                                 min_size = 1)

#grid_treeloss |>
#  unnest(treecover_area) |>
#  pull(treecover) |>
#  unique() |>
#  sort(decreasing = TRUE) |>
#  head()

Created on 2022-12-16 with reprex v2.0.2

So I tried to apply the same workflow on the large data set again. (To my delight I must say XD), the error: [crop] extents do not overlap didn't show up this time, but it made me more confused about why the error message occurred because I don't see difference between today's run and the previous run where the error occurred. (A rendered html file can be found here as the error occurred during processing)

But anyway, I would suppose that the error was not raised by the unknown CRS in the original geopackage because it was assigned to WGS84 before being used to calculate tree indicators. Besides, it can be seen in the html file that the aoi lays well within the footprints of the raster tiles and that the error was not raised to all of the input polygons, meaning that part of the processing worked without an error.

Though without the reported error, today's processing on the large dataset crashed unluckily again in the end, but it probably should belong to another issue. Since I'm not able to reproduce this error at the moment, it's fine if you close this issue. Thank you for taking a look into it:D