prioritizr / prioritizr

Systematic conservation prioritization in R
https://prioritizr.net
122 stars 25 forks source link

CBC + prioritizr crashes with sf #247

Closed jaseeverett closed 2 years ago

jaseeverett commented 2 years ago

I have come across a strange issue. We are using the CBC solver for a prioritisation question that involves about 60 features in sf format. As part of this, the end-user will run lots of different combinations of the data. When only one feature is included, R crashes for more than half of the features. After lots and lots of interrogation of the data I have discovered that this occurs when the last planning unit in the scenario contains a 0 for the input feature.

I have created a reprex in the attached zip. If you run the first part of the code, the analysis crashes R. If you then skip to the second part (where I add a 1 to last planning unit) it runs without a problem. If I change the code to use Gurobi, it runs ok.

I am unsure as to whether this is related to the CBC solver itself, or if this is the prioritizr implementation? Unfortunately R crashes before it gives any error information so it is hard to know where the error is.

I have tried to recreate this using the PUs in the prioritizr package (sim_pu_sf) with some dummy data, but I can't. Perhaps the issue is related to the large number of PUs I have (>15K) or the CRS we are using.

CBCprioritizrIssue.zip

jeffreyhanson commented 2 years ago

Thanks very much for reporting this! I encountered something similar ages ago, but wasn't able to pin it down so precisely - nice work! I'll try and push a branch with a bug fix before the end of today.

jeffreyhanson commented 2 years ago

@jaseeverett, could you please try installing the cranrel2 branch and see if that fixes it?

jaseeverett commented 2 years ago

Thanks @jeffreyhanson. I have installed the branch. The short answer is that it doesn't crash anymore with a zero in the last cell. Thanks!! But I get new presolve_check errors that didn't occur previously.

I understand the reason for them, but it seems I can't turn the presolve_check off for sf objects? Can you clarify?

I tried just the first problem line,

p1 <- prioritizr::problem(dat, features = "V1", cost_column = "Cost_Total", run_checks = FALSE)

and it errors with

Error: unused arguments

Is turning the checks off supported for sf objects? The help says it is for sf,Raster but not for sf,character

## S4 method for signature 'sf,Raster'
problem(x, features, cost_column, run_checks, ...)

## S4 method for signature 'sf,character'
problem(x, features, cost_column, ...)

I'm unsure of the meaning of the second class (ie ,character). Is this the format of the features? Is there a way to turn the presolve checks off for my data. Is this something you would consider adding?

Unfortunately this fix is going to cause trouble because some of my problem combinations will not run, even if we want to at least see what the solution is. The area is very data poor so in some instances there is not much we can do when we are testing just a few features at a time.

Should we close this issue and I'll open another for turning off the presolve checks for sf?.

Thanks again for your rapid response.

jeffreyhanson commented 2 years ago

Yeah, this error is happening because problem(..., run_checks = FALSE) parameter is only available when the planning unit and feature data belong to certain classes/data types. Since the run_checks parameter is not available in problem() when x=sf,y=character data types, R is saying there's an unused argument when you specify run_checks = FALSE. To answer your question about turning off checks, the run_checks parameter in the problem() function controls whether preliminary checks should be done on the planning unit and feature data to ensure they contain valid values (e.g. all cost values aren't NA). Since these preliminary checks can take a long time for raster data, the run_checks parameter can be used to skip them (they don't take so long for other data types, so we didn't bother adding in the ability to skip them).

The "preliminary checks" -- which are controlled via problem(..., run_checks = FALSE) -- are different from the "presolve checks" though. The preliminary checks are run on the planning unit and feature data when first initializing the problem(), and the presolve checks are when building the mathematical optimization problem for the solver (e.g. when you use solve()). To skip the presolve checks, you can use solve(x, run_checks = FALSE). You should be able to skip the presolve checks -- regardless of the planning unit or features data types used to build the problem -- using the run_checks parameter in the solve() function---so if you see errors here then that's definitely a bug so please let me know!

So, I would suggest adapting your example like this to skip the presolve checks:

s1 <- prioritizr::problem(dat, "V1", "Cost_Total") %>%
  prioritizr::add_min_set_objective() %>%
  prioritizr::add_relative_targets(0.2) %>%
  prioritizr::add_binary_decisions() %>%
  prioritizr::add_cbc_solver(verbose = FALSE) %>%
  prioritizr::solve(run_checks = FALSE) %>%
  st_as_sf()

Does that help? Please let me know if anything's still unclear, or if you have any further questions?

jaseeverett commented 2 years ago

Thanks Jeff. That makes sense. I wasn't aware there were two types of checks. I will move the run_checks to the solve call as you suggest. As an aside, have you re-enabled these checks in this latest version? I haven't encountered this previously with this dataset, even for features that didn't have a zero in the last cell. I am happy for you to close this issue now. I will reopen if I encounter further issues.

jeffreyhanson commented 2 years ago

Excellent! Sorry, I'm not sure what you mean by re-enalbed these checks? The developmental version on GitHub has had several updates to improve the presolve checks that aren't on the CRAN version yet. In partiuclar, the checks present in the current CRAN version tend to flag a lot of false positives (i.e. it says there could be issues when everything's fine), and the GitHub version has been updated to help avoid this. Does that help answer your question?

jaseeverett commented 2 years ago

Thanks

jeffreyhanson commented 2 years ago

We've just merged the PR to fix the rcbc crashing bug and so this automatically closed the issue. Please feel free to re-open the issue or continue posting if you had any other questions?