mapme-initiative / mapme.biodiversity

Efficient analysis of spatial biodiversity datasets for global portfolios
https://mapme-initiative.github.io/mapme.biodiversity/dev
GNU General Public License v3.0
33 stars 7 forks source link

refactor backend and include register methods #179

Closed goergen95 closed 1 year ago

goergen95 commented 1 year ago

Hi @karpfen,

this PR is quite extensive and besides some long-overdue cleaning of the backend code includes a very cool new feature. There are two new exported functions (register_resource() and register_indicator()) that allows users to supply custom function for both resources and indicators. Internally, we now use the very same mechanism to register the default resource/indicator functions. This also makes it a lot easier to add new functionality since everything that needs to be done to add a new resource/indicator now happens in a single file.

I don't know how familiar you already are with the backend code, but I think now it is way more straight-forward and I would like to ask you if you can review the proposed changes and maybe conduct some tests? I would like to this PR be included in the next CRAN release

Below you find an example how user-side code to add a custom indicator function would look like.

Cheers, Darius.

library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.6.2, PROJ 9.1.1; sf_use_s2() is TRUE
remotes::install_github("mapme-initiative/mapme.biodiversity", ref = "refactor-backend")
#> Skipping install of 'mapme.biodiversity' from a github remote, the SHA1 (9236e41b) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(mapme.biodiversity)

temp_loc <- file.path(tempdir(), "mapme.biodiversity")
if (!file.exists(temp_loc)) {
  dir.create(temp_loc)
  resource_dir <- system.file("res", package = "mapme.biodiversity")
  file.copy(resource_dir, temp_loc, recursive = TRUE)
}
#> [1] TRUE

aoi <- system.file("extdata", "sierra_de_neiba_478140_2.gpkg",
                   package = "mapme.biodiversity"
) %>%
  read_sf() %>%
  init_portfolio(
    years = 2000:2020,
    outdir = file.path(temp_loc, "res"),
    tmpdir = tempdir(),
    add_resources = FALSE,
    verbose = FALSE
  ) %>%
  get_resources("nasa_srtm")

calc_slope <- function(x, nasa_srtm, stat = "mean", ...){

  slope <- terra::terrain(
    nasa_srtm,
    v = "slope",
    unit = "degrees",
    neighbors = 8
  )

  out <- terra::extract(slope, x, fun = stat, na.rm = TRUE, ID = FALSE)
  out <- tibble::tibble(as.numeric(out))
  names(out) <-paste("slope_", stat, sep = "")
  out
}

register_indicator(
  name = "slope",
  resources = list(nasa_srtm = "raster"),
  fun = calc_slope,
  arguments = list(stat = "mean"),
  processing_mode = "asset"
)

aoi %>%
  calc_indicators("slope", stat = "median") %>%
  tidyr::unnest(slope)
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -71.80933 ymin: 18.57668 xmax: -71.33201 ymax: 18.69931
#> Geodetic CRS:  WGS 84
#> # A tibble: 1 × 7
#>   WDPAID NAME     DESIG_ENG ISO3  assetid slope_median                      geom
#>    <dbl> <chr>    <chr>     <chr>   <int>        <dbl>             <POLYGON [°]>
#> 1 478140 Sierra … National… DOM         1         17.7 ((-71.76134 18.66333, -7…

Created on 2023-08-04 with reprex v2.0.2

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 75.88% and project coverage change: -4.68% :warning:

Comparison is base (bcdca79) 77.98% compared to head (6539af4) 73.31%.

:exclamation: Current head 6539af4 differs from pull request most recent head 90edb69. Consider uploading reports for the commit 90edb69 to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #179 +/- ## ========================================== - Coverage 77.98% 73.31% -4.68% ========================================== Files 42 41 -1 Lines 1994 1705 -289 ========================================== - Hits 1555 1250 -305 - Misses 439 455 +16 ``` | [Files Changed](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative) | Coverage Δ | | |---|---|---| | [R/get\_chirps.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfY2hpcnBzLlI=) | `56.25% <ø> (ø)` | | | [R/get\_esalandcover.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfZXNhbGFuZGNvdmVyLlI=) | `56.89% <ø> (ø)` | | | [R/get\_fritz\_et\_al.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfZnJpdHpfZXRfYWwuUg==) | `27.02% <ø> (ø)` | | | [R/get\_gfw\_emissions.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfZ2Z3X2VtaXNzaW9ucy5S) | `78.57% <ø> (-21.43%)` | :arrow_down: | | [R/get\_gfw\_lossyear.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfZ2Z3X2xvc3N5ZWFyLlI=) | `53.84% <ø> (-11.54%)` | :arrow_down: | | [R/get\_gfw\_treecover.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfZ2Z3X3RyZWVjb3Zlci5S) | `83.33% <ø> (-10.00%)` | :arrow_down: | | [R/get\_gmw.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfZ213LlI=) | `28.88% <0.00%> (-1.35%)` | :arrow_down: | | [R/get\_nasa\_firms.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfbmFzYV9maXJtcy5S) | `46.87% <ø> (ø)` | | | [R/get\_nasa\_grace.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfbmFzYV9ncmFjZS5S) | `70.83% <ø> (ø)` | | | [R/get\_nasa\_srtm.R](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative#diff-Ui9nZXRfbmFzYV9zcnRtLlI=) | `48.00% <ø> (ø)` | | | ... and [30 more](https://app.codecov.io/gh/mapme-initiative/mapme.biodiversity/pull/179?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mapme-initiative) | |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

karpfen commented 1 year ago

Hi @goergen95, this looks really cool! I'll probably need a couple of days to sift through this, but I'll get back to you as soon as I've got any news or questions.

Cheers, Andreas

karpfen commented 1 year ago

Hi @goergen95, I just finished taking a look at your changes. Looks really neat to me, great work! I found no functional problems, just one or two suggestions. I Proposed a few suggestions for wording and structuring which I think make things a bit more concise. Feel free to mix and match there.

Note: I had to define utils::globalVariables(c("anomaly")) in order to run devtools::check() over quickstart.Rmd. But I think this is as expected.

AAA_register.R

General: Is the filename AAAregister.R intentional or is the "AAA" part a leftover from development? I also suggest to split the file into register_resource.R and register_indicator.R (resp. AAA_register_resource.R and AAA_registerindicator.R if the AAA was intentional).

l. 9 Proposed wording: Registers a custom resource function to access data or functionality otherwise not native to \code{mapme.biodiversity} to be used with \code{get_resources}. Custom resources will also be registered in the list generated by \code{available_resources}.

Note that registering your own resource function will only have effect for the current R session. If you return to your analysis in a new session, you will have to re-register your custom resource.


l. 88 Proposed wording: Registers a custom indicator function to access indicators not native to \code{mapme.biodiversity} to be used with \code{calc_indicators}. Custom indicators will also be registered in the list generated by \code{available_indicators}.

Note that registering your own indicator function will only have effect for the current R session. If you return to your analysis in a new session, you will have to re-register your custom indicator.


l. 111 typo treecover_arae -> treecover_area


l. 133 maybe add a check "nchar(name) > 0" to catch "" as name. Right now it's possible to register resources/indicators with name "". I don't see any direct problems with this, but I'm not sure it can cause problems later on.


l. 134 Wording. From reading the message I thought argument name must be a 1-character string like "a". Proposed wording: name needs to be one character string.


l. 190 Proposed wording: This function returns a list of resource names and parametrization options for the specified resources. If no resource names are provided, it lists all available resources. Use it to learn about possible additional arguments that can be specified when requesting a resource.

@param resources If \code{NULL} returns a list of all resources. Otherwise only the ones specified.


l. 220 Proposed wording: This function returns a list of indicator names and parametrization options for the specified indicators. If no indicator names are provided, it lists all available indicators. Use it to learn about possible additional arguments that can be specified when requesting a indicator.

@param indicators If \code{NULL} returns a list of all indicators. Otherwise only the ones specified.


get_resource.R

goergen95 commented 1 year ago

Hi, thank you very much for having a detailed look! I will adopt the wording according to your suggestions.

The AAA_register.R filename is not a development artifact, but since R seems to evaluate the files within R/ in alphabetical order we need to have declared the register functions and the package environment before calling them. I am unsure if there exists a better mechanism, but this is what currently seems to work best. Feel free to suggest something else if you have an idea? Thus I would also currently rather not split up the file into parts...

goergen95 commented 1 year ago

The issue with the undeclared anomaly variable is new to me and I can't reproduce. Would you please share your sessionInfo()?

karpfen commented 1 year ago

Ok, I tried reproducing the issue with the anomaly variable, but now it works and I can't get it to un-work again. So problem solved I guess :) Just if you're interested, it was essentially this issue: https://community.rstudio.com/t/devtools-check-error-with-vignette-but-vignette-can-be-knitted-without-problem/55962

Regarding the load order: this can be determined using Collate in DESCRIPTION. I think there's two ways to do this:

goergen95 commented 1 year ago

Great, thanks!

CRAN policies suggest that all files must be present in the Collate field (see screenshot).

cran-collate

So I guess including the @include register.R tag in each file is the best way to achieve this automatically..

goergen95 commented 1 year ago

Just another question, have you tested this PR with parallelization? I did on my machine, and it worked with different plan options. I just would like to make sure that parallelizations is not affected by these changes...

karpfen commented 1 year ago

@goergen95 I just tried parallelization on Windows with the example from the README and it worked as expected.

Btw, I found two typos in the README:

goergen95 commented 1 year ago

Great! Ready for merge then, I`d say? Thanks for your feedback!!

(P.S.: PRs for docu stuff are always welcome ;) )