COBREXA / COBREXA.jl

COnstraint Based Reconstruction and EXascale Analysis (in Julia)
https://cobrexa.github.io/COBREXA.jl/stable/
Apache License 2.0

Switch all documentation over from GLPK to HiGHs #57

Closed: oxinabox closed this issue 1 month ago

oxinabox commented 1 month ago

Target functionality

The docs on occasion reference using GLPK. Use of GLPK should generally be discouraged at this point.

To quote the JuMP issue (they were talking about MILP, but the point stands for LP as well): https://github.com/jump-dev/JuMP.jl/issues/2878

... examples in the JuMP documentation use GLPK... However, it has the down-side of being the first solver that new users see when they read the docs.

We should switch ... examples to HiGHS because it's faster, more robust, and being actively developed.

(Emphasis mine)

To make it clear why, consider FBA on yeast. (I can make this an MWE if wanted, but I don't think it's needed; it's on this SBML.) Second run of each timing, so everything is already compiled:

julia> @time sol_glpk = flux_balance_analysis(model, optimizer = GLPK.Optimizer);
  1.387430 seconds (838.36 k allocations: 49.660 MiB)
julia> @time sol_clp = flux_balance_analysis(model, optimizer = Clp.Optimizer);
  0.405093 seconds (805.17 k allocations: 49.605 MiB)
julia> @time sol_highs = flux_balance_analysis(model, optimizer = HiGHS.Optimizer);
  0.154727 seconds (827.32 k allocations: 50.077 MiB)

Desired output

The docs should reference using HiGHs instead.

Optional: Suggestions for implementation

Basically just change every reference to GLPK found here to be HiGHs.
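
For illustration, a minimal sketch of what a converted docs snippet might look like, reusing the flux_balance_analysis call from the benchmark above; load_model and the SBML file name are placeholders, not part of this issue:

    using COBREXA, HiGHS

    # placeholder model file; any SBML model works the same way
    model = load_model("yeast-GEM.xml")

    # same call as in the benchmark above, only the optimizer changes
    sol = flux_balance_analysis(model, optimizer = HiGHS.Optimizer)

Since the optimizer is just a keyword argument, the switch should mostly be a mechanical find-and-replace of GLPK.Optimizer with HiGHS.Optimizer (plus the corresponding using statements and doc environment dependencies).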

stelmo commented 1 month ago

Hi there! I had no idea HiGHs was so much better! Thanks for bringing this to our attention. We will change our docs to reflect this :)

exaexa commented 1 month ago

Hi! Certainly thanks for bringing this up, but why exactly HiGHS? GLPK is used here on purpose, not because it's the fastest or most featureful, but because it's the least surprising one (if it fails, it fails predictably, and every other metabolic modelling package uses it as a default as well). The need to pick the right solver for "real stuff" is pretty well known in the community, and in the end everyone is going to go with their fav Gurobi anyway :D

If there is no benefit other than speed, we might want to go for something simpler instead (we already used Tulip at some point, right?). Other choices (possibly valid also for JuMP) include SCIP and ECOS. Anyway, this is nothing against HiGHS; I'm more just checking whether the choice isn't wider.

exaexa commented 1 month ago

aaaanyway, fished this out of history:

fba-all.pdf

yeah let's do highs I guess

oxinabox commented 1 month ago

I agree everyone in academia will use Gurobi, but if you can't get an academic license, Gurobi is prohibitively expensive. And if you are just getting into the area, you might a) not know that not all solvers are equal, and b) not have a Gurobi license.

It's not just speed; HiGHs is:

faster, more robust, and being actively developed.

A lot of the older open-source solvers are largely unmaintained: it's a problem (it's a problem for all open source, but especially for solvers). Fortunately, they mostly just keep working. @odow is the person to talk to about what's going on in that world. In general, my feeling is that HiGHs is now generally the most sensible solver to use for linear problems.

odow commented 1 month ago

There was essentially never a good reason to use GLPK. The only historical reason we used it in the JuMP docs was because it supported callbacks, and we wanted to minimize the number of unique solvers in the documentation (because installing and (pre)compilation was getting pretty expensive).

If you want an open-source (MI)LP solver, use HiGHS. Don't look anywhere else, and don't even really bother with benchmarks. The "being actively developed" outweighs everything else.
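
To illustrate the "drop-in" nature of the recommendation, here is a minimal JuMP sketch on a toy LP (not from the original comment, and not tied to COBREXA); HiGHS plugs in exactly where GLPK would:

    using JuMP, HiGHS

    # toy LP: maximize x + 2y subject to x + y <= 1, x, y >= 0
    model = Model(HiGHS.Optimizer)   # drop-in replacement for GLPK.Optimizer
    set_silent(model)
    @variable(model, x >= 0)
    @variable(model, y >= 0)
    @constraint(model, x + y <= 1)
    @objective(model, Max, x + 2y)
    optimize!(model)
    objective_value(model)   # 2.0
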

@jajhall: here's another use-case for you: "exa-scale metabolic modeling"

jajhall commented 1 month ago

Thanks @odow for enlightening another community. Also, COBREXA people, Oscar is right in referring to it as HiGHS, not HiGHs.

exaexa commented 1 month ago

I tried that in #60; it looks like there are some small issues to be clarified, but it could work.

It's not just speed; HiGHs is:

faster, more robust, and being actively developed.

That's a pretty strong claim, citation needed. (Does HiGHS have any research/proofs out other than the one referenced from the website?)

I've got the same feeling that we should just go for HiGHS if it's feasible here (from #60 it might very well be), but I'd much rather have a proof in hand.

odow commented 1 month ago

That's a pretty strong claim, citation needed

https://plato.asu.edu/ftp/lpopt.html https://plato.asu.edu/ftp/milp.html

Note that GLPK is not on these benchmarks because it is not competitive, but HiGHS is the fastest and most robust open-source (MI)LP solver.

https://github.com/ERGO-Code/HiGHS/graphs/contributors

Without speaking for @jajhall, I would treat any instance in which GLPK is faster or more robust as a bug in HiGHS.

jajhall commented 1 month ago

Flipping the argument, I don't know how anyone would justify using GLPK over HiGHS.

@odow's observations on the performance of GLPK are quantified here, where it solves 23/240 MIP problems, compared with HiGHS' 159/240 in the current benchmarks. GLPK hasn't been developed in the meantime.

There's no "proof" to justify the claims: discrimination between optimization solvers is established on performance. That said, GLPK for LP is intrinsically slower since it doesn't exploit hyper-sparsity. The hyper-sparsity work is described in one of the three papers underpinning HiGHS, and it won the journal's best paper award for that year, if you're looking for academic distinction.

exaexa commented 1 month ago

Yeah. The flipped argument works for me too, except for the "minimal surprise" rule (which gets interesting in this community :D )

Let me work out what's wrong with the #60 tests; other than that, all good.

exaexa commented 1 month ago

Hi all, the switch has just been finished; it should get deployed to the stable docs in the next release. Thanks all for the info!

One more thing (just curious™): @oxinabox, how come you (as a JuMP dev, I guess) noticed this? Are you using COBREXA for something, or did you just go through all JuMP packages and scan for GLPK use?

odow commented 1 month ago

One more thing (just curious™): @oxinabox, how come you (as a JuMP dev, I guess) noticed this?

@oxinabox is not a JuMP developer, but I think in this case, is an actual user of COBREXA :smile:

oxinabox commented 1 month ago

I am not a JuMP developer, but I am a long-time user of JuMP, and a very new (and confused) user of COBREXA. My background isn't in systems bio (hence very confused, all the time).

exaexa commented 1 month ago

@oxinabox ah, OK, good to know. Please feel free to ask on the Julia Slack; we kinda concentrate in #sbml nowadays.