mapme-initiative / mapme.biodiversity

Efficient analysis of spatial biodiversity datasets for global portfolios
https://mapme-initiative.github.io/mapme.biodiversity/dev
GNU General Public License v3.0

Change parallelization strategy to be more robust #137

Closed Jo-Schie closed 1 year ago

Jo-Schie commented 1 year ago

Background: The package currently cannot run in parallel on Windows machines (#15), and there are also stability issues. The calc_indicators function occasionally gets stuck and stays there for a long time until it crashes. This seems to be quite arbitrary: the same input data and the same indicators sometimes work and sometimes do not. There is no apparent error message, and if the calculation is repeated it might work. There is an issue discussing this at #90.

From a user perspective, it would be beneficial if the parallelization approach were made more robust and cross-platform. There is an ongoing discussion on how to solve this problem (#136).

Definition of done: The proposed approach works and has been tested cross-platform on Windows/Mac/Linux. A test script and a test dataset are provided for the "testers".

Complexity: low

goergen95 commented 1 year ago

Hi,

PR #138, which adds parallel computing via the future backend, was just merged into main. It also enables parallel computing on Windows, thus closing the outstanding issue #15.

Please see here for a reprex that can be used for testing. Most importantly, to enable parallel computing, a multi-core session needs to be initiated via the future package in user-side code:

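library(future)
library(progressr)
library(mapme.biodiversity)

# `portfolio` is assumed to be an already initialized portfolio
# with the required resources available
plan(multisession, workers = 5)  # start 5 background R sessions
with_progress({
  portfolio <- calc_indicators(
    portfolio,
    "treecover_area",
    min_cover = 10,
    min_size = 5
  )
})
plan(sequential)  # switch back to sequential processing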

To display a progress bar, the calc_indicators() call needs to be wrapped in progressr::with_progress() (as done above). Alternatively, progressr::handlers(global = TRUE) can be set once at the start of a script or elsewhere. This approach is documented in both the README and the Quickstart vignette.
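For anyone who wants to check the future/progressr mechanics without downloading any data, here is a minimal sketch. It is not the package's internal code: `slow_square()` and `process_all()` are hypothetical stand-ins for the per-asset work inside calc_indicators(), and only the future and progressr packages are assumed to be installed.

```r
library(future)
library(progressr)

# Hypothetical stand-in for the work done on a single asset
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Hypothetical driver: one future per element, one progress step per element
process_all <- function(xs) {
  p <- progressor(along = xs)
  fs <- lapply(xs, function(x) {
    future({
      res <- slow_square(x)
      p()  # signal one step of progress from the worker
      res
    })
  })
  lapply(fs, value)  # collect results; progress signals are relayed here
}

plan(multisession, workers = 2)  # background R sessions, also works on Windows
result <- with_progress(process_all(1:6))
plan(sequential)                 # switch back and shut down the workers
```

If this runs through on a machine, the backend itself is working there, which can help separate setup problems from indicator-specific ones.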

goergen95 commented 1 year ago

Add documentation for multicore vs. multisession
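As a starting point for that documentation, a rough sketch of the difference: multisession runs workers in separate background R sessions and is available on all platforms, while multicore forks the current R process and is not supported on Windows (future also disables it by default when running inside RStudio). The helper below, `choose_plan()`, is hypothetical and not part of mapme.biodiversity. It only uses forking when it is explicitly requested and future::supportsMulticore() reports that it is available.

```r
library(future)

# Hypothetical helper: use forked workers only when requested and supported,
# otherwise fall back to background R sessions
choose_plan <- function(workers = 2, prefer_fork = FALSE) {
  if (prefer_fork && supportsMulticore()) {
    plan(multicore, workers = workers)
    "multicore"
  } else {
    plan(multisession, workers = workers)
    "multisession"
  }
}

backend <- choose_plan(workers = 2)
message("Using backend: ", backend)
# ... calc_indicators() would be called here ...
plan(sequential)  # switch back to sequential processing
```

Defaulting to multisession mirrors the example earlier in this thread and behaves the same on Windows, Mac and Linux.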