Closed goergen95 closed 5 months ago
Oh, man! That looks awesome!!!!
Attention: 33 lines in your changes are missing coverage. Please review.
Comparison is base (5b361b1) 76.01% compared to head (ade94a4) 75.33%.
:exclamation: Current head ade94a4 differs from pull request most recent head e5cdfdf. Consider uploading reports for the commit e5cdfdf to get more accurate results.
Files | Patch % | Lines |
---|---|---|
R/calc_treecover_area.R | 83.33% | 16 Missing :warning: |
R/calc_treecover_area_and_emissions.R | 79.41% | 7 Missing :warning: |
R/calc_treecoverloss_emissions.R | 78.26% | 5 Missing :warning: |
R/utils.R | 16.66% | 5 Missing :warning: |
Just to provide some evidence, here is a somewhat extreme use case. I simply took the bounding box of a large PA in Brazil (WDPAID = 33613) and compared the two routines.
Results: The new routine is 51 times faster on my machine for this AOI and the difference in the treecover estimation is about 0.6% (3543 ha)! :tada:
Current routine:
remotes::install_github("mapme-initiative/mapme.biodiversity", ref = "main")
#> Skipping install of 'mapme.biodiversity' from a github remote, the SHA1 (7b0a2173) has not changed since last install.
#> Use `force = TRUE` to force installation
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.8.1, PROJ 9.1.1; sf_use_s2() is TRUE
library(mapme.biodiversity)
aoi <- "POLYGON ((-62.2827 -13.5376, -61.1715 -13.5376, -61.1715 -12.8, -62.2827 -12.8, -62.2827 -13.5376))"
aoi <- st_as_sfc(aoi, crs = st_crs("EPSG:4326")) |> st_as_sf()
area <- st_area(aoi)
units::set_units(area, "km²")
#> 9867.781 [km²]
outdir <- tempfile()
dir.create(outdir)
aoi <- init_portfolio(aoi, years = 2000:2022, outdir = outdir)
aoi <- get_resources(aoi, resources = c("gfw_treecover", "gfw_lossyear"),
vers_treecover = "GFC-2022-v1.10",
vers_lossyear = "GFC-2022-v1.10")
#> Starting process to download resource 'gfw_treecover'........
#> Starting process to download resource 'gfw_lossyear'........
timing <- system.time(aoi <- calc_indicators(aoi, "treecover_area"))
#> Argument 'min_size' for resource 'treecover_area' was not specified. Setting to default value of '10'.
#> Argument 'min_cover' for resource 'treecover_area' was not specified. Setting to default value of '35'.
aoi$treecover_area[[1]]
#> # A tibble: 23 × 2
#> years treecover
#> <int> <dbl>
#> 1 2000 610823.
#> 2 2001 610089.
#> 3 2002 608337.
#> 4 2003 600961.
#> 5 2004 597900.
#> 6 2005 592184.
#> 7 2006 590614.
#> 8 2007 589804.
#> 9 2008 588828.
#> 10 2009 588093.
#> # ℹ 13 more rows
timing
#> user system elapsed
#> 1065.052 17.670 1082.735
Created on 2023-12-19 with reprex v2.0.2
New routine:
remotes::install_github("mapme-initiative/mapme.biodiversity", ref = "speed-up-gfw-routines")
#> Skipping install of 'mapme.biodiversity' from a github remote, the SHA1 (2bb12db1) has not changed since last install.
#> Use `force = TRUE` to force installation
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.8.1, PROJ 9.1.1; sf_use_s2() is TRUE
library(mapme.biodiversity)
aoi <- "POLYGON ((-62.2827 -13.5376, -61.1715 -13.5376, -61.1715 -12.8, -62.2827 -12.8, -62.2827 -13.5376))"
aoi <- st_as_sfc(aoi, crs = st_crs("EPSG:4326")) |> st_as_sf()
area <- st_area(aoi)
units::set_units(area, "km²")
#> 9867.781 [km²]
outdir <- tempfile()
dir.create(outdir)
aoi <- init_portfolio(aoi, years = 2000:2022, outdir = outdir)
aoi <- get_resources(aoi, resources = c("gfw_treecover", "gfw_lossyear"),
vers_treecover = "GFC-2022-v1.10",
vers_lossyear = "GFC-2022-v1.10")
#> Starting process to download resource 'gfw_treecover'........
#> Starting process to download resource 'gfw_lossyear'........
timing <- system.time(aoi <- calc_indicators(aoi, "treecover_area"))
#> Argument 'min_size' for resource 'treecover_area' was not specified. Setting to default value of '10'.
#> Argument 'min_cover' for resource 'treecover_area' was not specified. Setting to default value of '35'.
aoi$treecover_area[[1]]
#> # A tibble: 23 × 2
#> years treecover
#> <int> <dbl>
#> 1 2000 614366.
#> 2 2001 613628.
#> 3 2002 611866.
#> 4 2003 604446.
#> 5 2004 601365.
#> 6 2005 595615.
#> 7 2006 594036.
#> 8 2007 593221.
#> 9 2008 592239.
#> 10 2009 591500.
#> # ℹ 13 more rows
timing
#> user system elapsed
#> 16.653 4.430 21.083
Created on 2023-12-19 with reprex v2.0.2
Here is another comparison when using a grid. Results indicate that processing time is reduced by half; the difference in area estimation for grid cell number 6 is about 1.7% (8 ha).
Current routine:
remotes::install_github("mapme-initiative/mapme.biodiversity", ref = "main")
#> Skipping install of 'mapme.biodiversity' from a github remote, the SHA1 (7b0a2173) has not changed since last install.
#> Use `force = TRUE` to force installation
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.8.1, PROJ 9.1.1; sf_use_s2() is TRUE
library(future)
library(progressr)
library(mapme.biodiversity)
aoi <- "POLYGON ((-62.2827 -13.5376, -61.1715 -13.5376, -61.1715 -12.8, -62.2827 -12.8, -62.2827 -13.5376))"
aoi <- st_as_sfc(aoi, crs = st_crs("EPSG:4326")) |> st_as_sf()
aoi <- st_make_grid(aoi, cellsize = c(0.025, 0.025)) |> st_as_sf()
area <- st_area(aoi)
mean(units::set_units(area[1], "km²"))
#> 7.513411 [km²]
outdir <- tempfile()
dir.create(outdir)
aoi <- init_portfolio(aoi, years = 2000:2022, outdir = outdir)
aoi <- get_resources(aoi, resources = c("gfw_treecover", "gfw_lossyear"),
vers_treecover = "GFC-2022-v1.10",
vers_lossyear = "GFC-2022-v1.10")
#> Starting process to download resource 'gfw_treecover'........
#> Starting process to download resource 'gfw_lossyear'........
plan(multisession, workers = 6)
with_progress({
timing <- system.time(aoi <- calc_indicators(aoi, "treecover_area"))
})
#> Argument 'min_size' for resource 'treecover_area' was not specified. Setting to default value of '10'.
#> Argument 'min_cover' for resource 'treecover_area' was not specified. Setting to default value of '35'.
plan(sequential)
aoi$treecover_area[[6]]
#> # A tibble: 23 × 2
#> years treecover
#> <int> <dbl>
#> 1 2000 452.
#> 2 2001 452.
#> 3 2002 450.
#> 4 2003 450.
#> 5 2004 450.
#> 6 2005 450.
#> 7 2006 450.
#> 8 2007 450.
#> 9 2008 450.
#> 10 2009 450.
#> # ℹ 13 more rows
timing
#> user system elapsed
#> 6.804 0.312 172.218
Created on 2023-12-21 with reprex v2.0.2
New routine:
remotes::install_github("mapme-initiative/mapme.biodiversity", ref = "speed-up-gfw-routines")
#> Skipping install of 'mapme.biodiversity' from a github remote, the SHA1 (2bb12db1) has not changed since last install.
#> Use `force = TRUE` to force installation
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.8.1, PROJ 9.1.1; sf_use_s2() is TRUE
library(future)
library(progressr)
library(mapme.biodiversity)
aoi <- "POLYGON ((-62.2827 -13.5376, -61.1715 -13.5376, -61.1715 -12.8, -62.2827 -12.8, -62.2827 -13.5376))"
aoi <- st_as_sfc(aoi, crs = st_crs("EPSG:4326")) |> st_as_sf()
aoi <- st_make_grid(aoi, cellsize = c(0.025, 0.025)) |> st_as_sf()
area <- st_area(aoi)
mean(units::set_units(area[1], "km²"))
#> 7.513411 [km²]
outdir <- tempfile()
dir.create(outdir)
aoi <- init_portfolio(aoi, years = 2000:2022, outdir = outdir)
aoi <- get_resources(aoi, resources = c("gfw_treecover", "gfw_lossyear"),
vers_treecover = "GFC-2022-v1.10",
vers_lossyear = "GFC-2022-v1.10")
#> Starting process to download resource 'gfw_treecover'........
#> Starting process to download resource 'gfw_lossyear'........
plan(multisession, workers = 6)
with_progress({
timing <- system.time(aoi <- calc_indicators(aoi, "treecover_area"))
})
#> Argument 'min_size' for resource 'treecover_area' was not specified. Setting to default value of '10'.
#> Argument 'min_cover' for resource 'treecover_area' was not specified. Setting to default value of '35'.
plan(sequential)
aoi$treecover_area[[6]]
#> # A tibble: 23 × 2
#> years treecover
#> <int> <dbl>
#> 1 2000 444.
#> 2 2001 444.
#> 3 2002 443.
#> 4 2003 443.
#> 5 2004 443.
#> 6 2005 443.
#> 7 2006 443.
#> 8 2007 443.
#> 9 2008 443.
#> 10 2009 443.
#> # ℹ 13 more rows
timing
#> user system elapsed
#> 3.717 0.138 88.381
Created on 2023-12-21 with reprex v2.0.2
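As a quick sanity check on the figures quoted in the two comparisons, the ratios can be recomputed from the printed output. This is only a sketch: the numbers are copied by hand from the reprex output above (elapsed seconds and the year-2000 tree cover in ha), the data frame and its column names are made up for illustration, and the grid-cell values are the rounded ones shown in the tibble print, so the percentage for that case is approximate.

```r
# Values copied from the reprex output above; column names are illustrative.
runs <- data.frame(
  case            = c("single polygon", "grid (cell 6)"),
  elapsed_current = c(1082.735, 172.218), # seconds, routine on main
  elapsed_new     = c(21.083, 88.381),    # seconds, routine on the PR branch
  cover_current   = c(610823, 452),       # ha in 2000, routine on main
  cover_new       = c(614366, 444)        # ha in 2000, routine on the PR branch
)

# Speed-up factor: how many times faster the new routine ran.
runs$speedup <- runs$elapsed_current / runs$elapsed_new

# Absolute and relative difference in the tree cover estimate.
runs$diff_ha  <- abs(runs$cover_new - runs$cover_current)
runs$diff_pct <- 100 * runs$diff_ha / runs$cover_current

print(runs[, c("case", "speedup", "diff_ha", "diff_pct")], digits = 3)
```

For the single polygon this gives a speed-up of roughly 51 and a difference of 3543 ha; for the grid the elapsed time roughly halves.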
Thanks, I included some styling and linting. Anyway, before merging, I think we should disable the tests for numerical stability in case {landscapemetrics} is not installed.
Something is wrong with Posit's package management servers (see here). Will re-run checks once this has settled.
FYI, it seems to be fixed now: https://fosstodon.org/@jvroberts/111773434263658017
Yep, thanks. I already re-ran the checks and they seem to pass. I have now included some serious refactoring and it makes the code so much more readable. You might want to take another look? Still thinking about also refactoring the associated tests quite a bit.
Tests now also refactored, I would be happy with merging this.
@goergen95 What do you think of replacing the installation instructions in the README with `remotes::install_github("https://github.com/mapme-initiative/mapme.biodiversity", dependencies = TRUE)`, resp. `install.packages("mapme.biodiversity", dependencies = TRUE)`?
That way the landscapemetrics dependency may become a bit more accessible.
Nice! That was definitely one of the headaches I still had with merging this. 😉
Btw I actually see the possibility to use landscapemetrics for, well, landscape metrics indicators in the future. Once we get a user request for this we might actually implement the respective indicator functions.
This PR reworks the GFW routines and improves the speed of calculation up to 10 times by introducing two main changes. The first is that `exactextractr` is now required for the indicator calculations. It is still included in SUGGESTS, but users are informed to install it in case `requireNamespace()` returns `FALSE`.
The main speed improvement, however, is achieved by relying on `landscapemetrics::get_patches()` instead of `terra::patches()`. Thus, `landscapemetrics` is now also included in SUGGESTS, but the function code will fall back to `terra` if it is not installed and issue a message to the user advising to install `landscapemetrics` for better computation times.
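The optional-dependency handling described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `require_pkg()` and `choose_patch_backend()` are hypothetical helper names, the `has_pkg` argument exists only so the logic can be exercised without the packages installed, and raising an error for a missing `exactextractr` is an assumption (the description only says users are informed).

```r
# Hypothetical helper: stop with an install hint if a suggested package
# that the routine depends on is missing.
require_pkg <- function(pkg, has_pkg = requireNamespace) {
  if (!has_pkg(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is required. Please install it.", pkg),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical helper: prefer landscapemetrics for patch detection and
# fall back to terra with a message if it is not installed.
choose_patch_backend <- function(has_pkg = requireNamespace) {
  if (has_pkg("landscapemetrics", quietly = TRUE)) {
    return("landscapemetrics")
  }
  message("Package 'landscapemetrics' is not installed, falling back to ",
          "'terra'. Consider installing it for better computation times.")
  "terra"
}

# Which backend would be used in the current R session?
backend <- choose_patch_backend()
print(backend)
```

Keeping the decision in one small function means the indicator code only has to branch on the returned backend name, and the tests can cover both branches regardless of what is installed.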