Closed ricschuster closed 2 years ago
Awesome - thanks!
@ricschuster, I've just pushed a new commit to this repo with the new rcbc package version. This won't affect anything when running the benchmarks on your Ubuntu system, but if you tried running the benchmarks on your Windows system, then it will use CBC version 2.10.5. So, hopefully, we should now see the same results on Windows as Ubuntu?
Thanks very much @jeffreyhanson
Maybe I should test a reduced benchmark version on Windows.
A CBC question for you: what do Windows users need to do to use CBC 2.10.5 at this point? I basically want to get a sense of whether it's easy enough for us to promote the solver to prioritizr users.
No worries - yeah, that's a great idea.
Sorry, I wasn't clear. RWinLib now has the Windows binary files for CBC 2.10.5, and rcbc has been updated to use these binary files (instead of the older version). So all a Windows user needs to do is install rcbc from GitHub (e.g. using `remotes::install_github('dirkschumacher/rcbc')`). Note that they still need to have Rtools installed on their computer.
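For reference, the install steps described above can be sketched as follows (the repo URL comes from the message itself; on Windows, Rtools must already be installed before running this):

```r
# Install the development version of rcbc from GitHub.
# On Windows this builds against the RWinLib CBC 2.10.5 binaries,
# so Rtools needs to be installed first.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("dirkschumacher/rcbc")
```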
Thanks very much. Just needing Rtools seems like a pretty reasonable requirement.
What do you think about promoting the new version of `prioritizr` on Twitter and to the Marxan group? We could wait for `rcbc` to go to CRAN, but as you noted earlier, that might take quite some time.
Yeah - great idea! I think the new version of prioritizr has some nice QOL features (e.g. the evaluation functions) that users might appreciate? Should the new CBC solver functionality be promoted too? On the one hand, it might be worth waiting till the benchmark vignette is finished before we promote the new CBC functionality -- so we have some stats/graphs for interested readers? On the other hand, I guess there's no harm in saying that preliminary analysis shows that it's pretty fast and that it might help people who can't use Gurobi/CPLEX? What do you think? Also, if it's helpful, I could compile a list of the main benefits/new features for you to include when promoting it?
Sorry, I just had a thought - it's probably worth checking that the CBC solver functionality is pretty fast on Windows with CBC 2.10.5 before promoting it? I think/suspect/guess a good chunk of the users would have Windows systems, so it's probably worth making sure that the performance we see on Ubuntu is still achieved on Windows?
If you could compile a list of the main benefits/new features, that would be awesome!
For CBC it would be good for the benchmarks to be finished for sure. That's going slowly right now; only 4 scenarios completed today (167/480 complete). Great idea about testing CBC on Windows first. I will do that. Fingers crossed it's comparable to Ubuntu.
Ok - sounds great - thank you so much for leading the benchmark stuff!
Here's a list of the main benefits/new features in prioritizr (in order of importance according to user needs; this is just my opinion though):

- New `add_cbc_solver` function to solve problems with the blazing fast, open source CBC solver (https://prioritizr.net/reference/add_cbc_solver.html).
- Updated `add_lpsymphony_solver` and `add_rsymphony_solver` functions so their `gap` parameter specifies the relative optimality gap (similar to the Gurobi and CPLEX solvers). This is more of a bug fix than a "new feature" per se -- so we should probably be careful how we describe this?
- Updated `add_lpsymphony_solver` to be more memory efficient.

Thanks very much Jeff!
The benchmarking is taking a lot longer than I'd hoped, primarily because of the boundary penalty factors and the open source solvers.
I think it would still be good to show benchmark results when we promote the new `prioritizr` version, but that might take a while. Do you think we should send something around now just for the updates, or should we wait for the benchmark vignette?
Yeah, I'm happy with either approach? So whatever you think is best? If you're not sure what to do either, then maybe we could set a time limit? E.g. we could see if the benchmark vignette can be completed in two weeks; if it is, then let's announce the new version along with the benchmark, and if the benchmark vignette is still a work-in-progress by then, let's announce the new version anyway?
Totally up to you though - I just wanted to suggest a third option in case it's helpful
I like it! Two weeks it is.
I've done a bit more testing now and it looks like using `add_min_shortfall_objective` in combination with `add_boundary_penalties` causes issues, because the `boundary_penalty` values we set are meant for `add_min_set_objective`. Even Gurobi has a really hard time finding a solution with `pus = 12902`. Do you have any ideas how to find better `boundary_penalty` values for `add_min_shortfall_objective`?
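For context, the problematic combination being discussed looks roughly like the sketch below. The function names are real prioritizr functions, but the input objects, budget, target, and penalty value are placeholders (not the actual benchmark data or a recommended BLM):

```r
library(prioritizr)

# pu_raster and feature_stack are hypothetical placeholder inputs:
# a planning-unit cost raster and a stack of feature distributions.
p <- problem(pu_raster, feature_stack) %>%
  add_min_shortfall_objective(budget = 1000) %>%  # budget is a placeholder
  add_relative_targets(0.17) %>%
  # The penalty (BLM) below was tuned for add_min_set_objective;
  # with add_min_shortfall_objective the same value can dominate the
  # objective and make the problem very hard to solve.
  add_boundary_penalties(penalty = 0.001) %>%
  add_binary_decisions() %>%
  add_cbc_solver(gap = 0.1, verbose = FALSE)

s <- solve(p)
```

Because the objective functions are on different scales, a penalty that behaves well under `add_min_set_objective` typically needs re-tuning (often by trying values across several orders of magnitude) when swapped into `add_min_shortfall_objective`.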
Yeah - I think you're exactly correct. I think we might have to set different BLM values for each objective. I could update the parameter file to allow you to do this - what do you think? If I did that, would you be able to play around with different values for the min shortfall objective?
Thanks Jeff! Let's set different BLMs per objective. Do you think BLMs would scale with the number of PUs (i.e. test on a small set and be okay on a bigger set), or might there be a scale issue with the BLM as well?
Ok, sounds good. Hmmm, it might work (since the targets are also relative). Couldn't hurt, I guess? Would it be easier to impose a (relatively) short time limit to make sure that a given set of BLM values works for the easier runs (e.g. fewer PUs) of the benchmark analysis? E.g. if a given set of BLM values doesn't work for even the easier runs, then we know they definitely need changing before they'll work for the larger runs?
Yeah, I was thinking about the time limit route as well. I was going to just use Gurobi and explore and set BLM values before expanding to other solvers.
Ok - yeah, that sounds good. Excellent idea restricting it to just Gurobi to find useful BLM values.
@ricschuster, I've just pushed a commit with the latest CRAN version of prioritizr and the ability to specify different BLM values in the `benchmark.toml` file. What do you think? Please let me know if it's not clear how to use, or if you have any follow-up questions.
Awesome!
Do I need to pull the entire repo from GitHub again because of the `prioritizr` update too?
Yeah, if you want to use the new version of prioritizr -- but that's probably not needed just for finding good BLM values.
Thanks!
I think I've finally figured out the benchmark parameters.
Gurobi, CPLEX, and CBC runs all completed in reasonable times today. Going to add `lpsymphony` and `Rsymphony` to the mix now and let things run over the weekend. Fingers crossed that this is it for benchmark runs.
Ah ok - awesome - fingers crossed!
All 480 scenarios completed running and I've created a new pre-release based on them.
I've updated the benchmark vignette with figures for `min_set` and will work on `min_shortfall` next.
Getting really close to having everything together for this.
Awesome - thanks! Yeah, it will be really exciting to see how the solvers compare with a different objective.
I've pushed a commit that now creates figures for both objective functions.
If you knit `benchmark.Rmd` on the benchmark branch of prioritizr, you can have a look.
The `min_shortfall` results are all over the place and I'm having a hard time interpreting them.
The main takeaway for me is: CBC only performs well for `min_set`.
What do you think about the results?
Awesome work!! I'm just tweaking the plots to add the stuff that I said I would add earlier (e.g. automatic unit conversions). Also, when I knit the Rmd to a html, the paragraphs look a bit odd (e.g. most sentences are on a different line). I think this might be because most of the sentences start on a new line (probably to avoid lines exceeding 80 characters in length)? So, I'm also reformatting the text to avoid this issue.
Ok, I've just pushed an updated version of the benchmark vignette. I've tried to include comments to help explain what's happening - but let me know if any changes are unclear?
This looks great, thanks very much for your updates!
As for the content: what are your thoughts on the min shortfall results? They are so all over the place (e.g. SYMPHONY outperforming the others in several cases) that I'm wondering if there is something strange going on. The min set results generally suggest that the setup is sound, but I'm really surprised at the lack of consistency. What do you think?
Yeah, I'm surprised at the lack of consistency too. I can't think of any reason why it could be due to a bug (e.g. now that we've standardized the timing methods), but the possibility still remains. My impression is that SYMPHONY performs better than CBC for min shortfall with no boundary penalties, generally for min shortfall with low boundary penalties, and for min shortfall with small to moderate numbers of planning units and high boundary penalties. So I guess I would generally recommend SYMPHONY over CBC for min shortfall, unless it takes ages, in which case trying CBC might be worth it. How does that sound?
Yeah, I think you are right. The CBC recommendation really only makes sense for min set, which isn't a bad thing -- min set is probably >90% of what people implement at the moment.
I'm kind of scared to think about CBC processing times for multi-zone problems now.
Yeah, it will be interesting to see how CBC performs for multi-zone problems. At some point in the future, it might be worth extending the benchmarks to include multiple zones so we can get a better handle on this -- but I don't think it's a priority for now?
I agree, multi-zone benchmarking can wait. Let's get this vignette finished first and then promote the package updates.
I've pushed some text updates to the benchmark vignette and was wondering if you could have a look? We haven't used the raster results yet, but I'm wondering if what we have now would be sufficient already? I think the vignette gives a good sense of solver performances as is. What do you think?
I just remembered that we didn't finish this up yet. If I remember correctly, @jeffreyhanson you had offered to finish the vignette text. Is that still the plan?
Yes, that's absolutely correct! Sorry, I forgot about this. I'll finish off the text today.
I think I finished off the text, so I'll close this issue.
Thanks Jeff! Just starting this new issue here to continue #4