New `add_rank_importance()` function for budget-limited prioritizations

jeffreyhanson commented 4 months ago

This feature proposal is to make it easy for users to generate rank importance measures (similar to Figure XX in https://doi.org/10.1038/s41559-021-01528-7). I would call this the "silver" standard approach for calculating importance measures for budget-limited prioritizations (with the "gold" standard being the replacement cost method). Although this new method won't be as informative as the replacement cost method and will only work with one management zone, it will scale to much larger problems than the replacement cost method.

In terms of function inputs, I think the user would need to input either (1) a problem, a solution, and a series of budget increments, or (2) a problem, a solution, and an integer specifying the number of budget increments. I'm not sure which would be better. Option 1 offers more flexibility, but option 2 is probably what the vast majority of users would want. Maybe we could do something clever, e.g., if a user inputs a single number then treat it as the number of desired increments, and if the user inputs a vector with multiple values then treat them as the desired budget increments? Or maybe that would be confusing?
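To make the "single number vs. vector" idea concrete, here is a minimal base R sketch. The helper name `resolve_budgets()` and its `total_budget` argument are hypothetical and only for illustration; nothing like this exists in prioritizr.

```r
# hypothetical helper (not part of prioritizr): interpret one input as
# either a number of increments or a vector of budgets
resolve_budgets <- function(x, total_budget) {
  stopifnot(is.numeric(x), length(x) >= 1, all(is.finite(x)), all(x > 0))
  if (length(x) == 1) {
    # single value: treat it as the number of evenly spaced increments
    stopifnot(x == round(x))
    seq(0, total_budget, length.out = x + 1)[-1]
  } else {
    # multiple values: treat them as the budgets themselves
    stopifnot(all(x <= total_budget))
    sort(x)
  }
}

# four evenly spaced budgets up to the total budget
resolve_budgets(4, total_budget = 100)
# user-specified budgets are kept as given (sorted)
resolve_budgets(c(10, 50, 100), total_budget = 100)
```

One thing this sketch makes visible is the possible source of confusion: with length-based dispatch, there is no way to request a single budget value, because a single number is always read as a count.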

How does that sound? Anyone have any suggestions, ideas for refining this, or concerns?

Here's a reprex showing how the ranks could be generated for a given problem. Note that although the example below has ranks such that lower values are more important, we would want to modify this so that ranks with higher values are more important to follow conventions with the other importance methods in prioritizr.

```r
# Initialization
## set seed
set.seed(500)

## load packages
library(terra)
library(prioritizr)

## set parameters
### set budget for prioritization as a proportion of the total costs
budget_prop <- 0.3
### set relative targets for features
target_prop <- 0.8
### set number of budget increments
n_budget_increments <- 10

# Preliminary processing
## load data
pu_data <- get_sim_pu_raster()
feature_data <- get_sim_features()

## define weights for features
feature_weights <- runif(nlyr(feature_data), min = 0.4, max = 0.9)

## calculate total budget
budget <- global(pu_data, "sum", na.rm = TRUE)[[1]] * budget_prop

## calculate budgets for incremental prioritizations
budget_increments <- seq(0, budget, length.out = n_budget_increments + 1)[-1]

# Main processing
## build initial prioritization with total budget
p_initial <-
  problem(pu_data, feature_data) %>%
  add_min_shortfall_objective(budget) %>%
  add_feature_weights(feature_weights) %>%
  add_relative_targets(target_prop) %>%
  add_binary_decisions() %>%
  add_default_solver(gap = 0)

## generate initial solution
s_initial <- solve(p_initial)
names(s_initial) <- "prioritization"

## determine which planning units are not selected by prioritization
pu_locked_out <- 1 - s_initial

## create base problem for incremental prioritizations
p_base <-
  problem(pu_data, feature_data) %>%
  add_feature_weights(feature_weights) %>%
  add_relative_targets(target_prop) %>%
  add_locked_out_constraints(pu_locked_out) %>%
  add_binary_decisions() %>%
  add_default_solver(gap = 0)

## generate incremental prioritizations
s_increments <- list()
for (i in seq_along(budget_increments)) {
  ### create problem for the i'th budget increment
  curr_p <-
    p_base %>%
    add_min_shortfall_objective(budget_increments[[i]])
  ### lock in planning units selected under the previous (smaller) budget
  if (i > 1) {
    curr_p <-
      curr_p %>%
      add_locked_in_constraints(s_increments[[i - 1]])
  }
  ### solve problem
  s_increments[[i]] <- solve(curr_p)
}
s_increments <- rast(s_increments)

## convert incremental prioritizations to ranks
## (i.e., index of the first increment that selects each planning unit)
s_ranks <- which.lyr(s_increments > 0.5)
names(s_ranks) <- "ranks"

# Exports
## plot initial prioritization and ranks
## note that planning units with a lower rank value are more important
plot(c(s_initial, s_ranks))
```
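As noted before the reprex, the ranks would need flipping so that higher values mean greater importance. A minimal sketch of one way to do that is shown here on a toy numeric vector rather than a raster; the names `ranks_low` and `ranks_high` are made up for illustration.

```r
# toy ranks where 1 = selected at the smallest budget (most important)
# and NA = planning unit not selected by the solution
ranks_low <- c(1, 3, NA, 2, 3)

# flip the ranks so that larger values are more important,
# leaving unselected planning units as NA
ranks_high <- max(ranks_low, na.rm = TRUE) + 1 - ranks_low
ranks_high
# → 3 1 NA 2 1
```

The same arithmetic should carry over to the `s_ranks` raster, e.g., by taking the maximum with `global(s_ranks, "max", na.rm = TRUE)[[1]]`, although that is untested here.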


ricschuster commented 3 months ago

Thanks very much Jeff! I like that idea a lot. A "silver" standard for problems with lots of planning units would be really great to have.

Regarding the function inputs, I do like this idea: "if a user inputs a single number then treat it as the number of desired increments, and if the user inputs a vector with multiple values then treat them as the desired budget increments". I don't think that would be too confusing.

Also agreed that higher ranks should mean higher importance.

No concerns from my end.

jeffreyhanson commented 3 months ago

Awesome - thanks @ricschuster!

We could try some clever implementation where the user has to be explicit about what the values mean, e.g., where `add_rank_importance(n = XXX)` or `add_rank_importance(budgets = XXX)` will work, but `add_rank_importance(XXX)` will throw an error. I think some dplyr functions do this.
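A rough base R sketch of how that could work is to place `n` and `budgets` after `...`, so that they can only be matched by name. The function `rank_budgets()` and its `total_budget` argument are hypothetical stand-ins for illustration; the real function would also take the problem and solution as its first arguments.

```r
# hypothetical signature, not the actual prioritizr API
rank_budgets <- function(..., n = NULL, budgets = NULL, total_budget) {
  # arguments declared after `...` can only be matched by name, so any
  # unnamed value ends up in `...` and triggers an error
  if (...length() > 0) {
    stop("values must be supplied as `n = ` or `budgets = `")
  }
  # exactly one of `n` or `budgets` must be given
  if (is.null(n) == is.null(budgets)) {
    stop("specify exactly one of `n` or `budgets`")
  }
  if (!is.null(n)) {
    # evenly spaced budgets up to the total budget
    return(seq(0, total_budget, length.out = n + 1)[-1])
  }
  sort(budgets)
}

# works: the meaning of the value is explicit
rank_budgets(n = 2, total_budget = 10)
rank_budgets(budgets = c(2, 6, 10), total_budget = 10)

# fails: an unnamed value is ambiguous
try(rank_budgets(5, total_budget = 10))
```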

ricschuster commented 3 months ago

> We could try some clever implementation where the user has to be explicit about what the values mean, e.g., where `add_rank_importance(n = XXX)` or `add_rank_importance(budgets = XXX)` will work, but `add_rank_importance(XXX)` will throw an error. I think some dplyr functions do this.

That is a great way to implement this!

New `add_rank_importance()` function for budget-limited prioritizations #337