biosustain / cameo

cameo - computer aided metabolic engineering & optimization
http://cameo.bio
Apache License 2.0

solver(s) question #260

Open picousse opened 4 years ago

picousse commented 4 years ago

Hi, as somebody who just wants to play a bit with FBA, I'm a bit stuck on what to do solver-wise.

I'm now running an example using a genome-scale model (on a Linux machine):

result = optknock.run(max_knockouts=1, target="EX_ac_e", biomass="BIOMASS_Ec_iAF1260_core_59p81M")

and it has already been running for 24 hours.

Does that even make sense?

I used Gurobi during my time at university, and that was fast, but it is also expensive for non-academic use.

What are the options for somebody who wants to play with FBA now and then? Can you do something with GLPK, or is that just a lost cause?

Midnighter commented 4 years ago

Hey @picousse,

You say FBA but your example shows an OptKnock run which is a mixed-integer linear programming (MILP) problem. Generally, GLPK performs very well on linear programming (LP) problems such as FBA, FVA, or linear MOMA but is very slow for MILP ones as you have discovered with OptKnock here.

For MILP, cameo's underlying solver interface optlang currently supports CPLEX and Gurobi, both of which have academic options but are otherwise pricey. There has been talk about SCIP but I wouldn't expect that in the near future.

picousse commented 4 years ago

Hi, yes, indeed, I was cutting corners in my original post.

Is there anything available in the cloud? A solver that you can pay for per hour or so? Gurobi has a cloud offering, but it seems to demand a sizeable investment up front.

CPLEX exists, but in the cloud it is called Watson?

NEOS Server might be interesting as well, as it seems to give access to CPLEX and/or Gurobi (and a couple of other solvers). But I have no clue whatsoever how I could use it.

So besides the answer on the more fundamental part, I'm also wondering how I could solve this technically.

Best regards, Pieter

picousse commented 4 years ago

How well does GLPK scale? Would it help to run it on an HPC cluster?

I've been running this since Friday, and it's still going strong...

Midnighter commented 4 years ago

Unfortunately, it's quite likely that it is stuck in the solution process. Maybe abort the process and restart it with solver log output so that you can see what the solver is doing. To do so, you should be able to set model.solver.configuration.verbosity = 3 before running OptKnock. But again, GLPK is orders of magnitude slower than commercial solvers for MILP problems. GLPK also cannot use multiple threads or processes itself. Only if you somehow parallelize it on the Python side would you see any improvement.
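
The verbosity setting suggested above can be wrapped in a small helper. This is only a minimal sketch: it assumes `model` is a cobra/cameo model whose `solver.configuration` is an optlang configuration object. The helper name `enable_solver_log` is made up for illustration, and the optional `timeout` (in seconds) is an addition that nobody in this thread mentioned; whether OptKnock respects it is not confirmed here.

```python
def enable_solver_log(model, verbosity=3, timeout=None):
    """Turn on solver log output and optionally cap the solve time.

    Assumes `model.solver.configuration` behaves like an optlang
    configuration object, as it does for cobra/cameo models.
    """
    config = model.solver.configuration
    config.verbosity = verbosity  # 3 is the level suggested in this thread
    if timeout is not None:
        config.timeout = timeout  # seconds; an assumption, not from the thread
    return config


# Hypothetical usage, with `model` and `optknock` created elsewhere:
# enable_solver_log(model, verbosity=3, timeout=3600)
# result = optknock.run(max_knockouts=1, target="EX_ac_e",
#                       biomass="BIOMASS_Ec_iAF1260_core_59p81M")
```

Setting this before the OptKnock run means the log starts with the very first solve, which makes it easier to see whether the solver is making progress or is stuck.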

Regarding your question for more options, maybe @phantomas1234 has an idea, but I can't think of anything else right now, sorry.

picousse commented 4 years ago

Okay, thanks for the tips and input. A bit more verbosity doesn't hurt :-)

picousse commented 4 years ago

Am I doing this right? I'm not seeing any verbose output right now.

image

Midnighter commented 4 years ago

You should see the solver output in the terminal where you started the Jupyter notebook.

picousse commented 4 years ago

Ah, okay. I'm running this on a JupyterLab server (via JupyterHub).

Pff, solvers seem to be a pain in the ass... After 33226 lines of log output, it just seems to stop, without any error:

 A: min|aij| =  5.500e-05  max|aij| =  5.998e+01  ratio =  1.091e+06
GM: min|aij| =  3.097e-02  max|aij| =  3.229e+01  ratio =  1.043e+03
EQ: min|aij| =  9.589e-04  max|aij| =  1.000e+00  ratio =  1.043e+03
Constructing initial basis...
Size of triangular part is 898
 222289: obj =   7.367009389e-02 inf =   3.159e+01 (156)
 222881: obj =   7.367009389e-02 inf =   5.430e-04 (1) 3
LP HAS NO PRIMAL FEASIBLE SOLUTION
glp_simplex: unable to recover undefined or non-optimal solution
GLPK Simplex Optimizer, v4.65
1673 rows, 3098 columns, 13088 non-zeros
 222881: obj =   7.367009389e-02 inf =   8.003e-04 (1)
LP HAS NO PRIMAL FEASIBLE SOLUTION
GLPK Simplex Optimizer, v4.65
1673 rows, 3098 columns, 13088 non-zeros
*222881: obj =   7.367009389e-02 inf =   2.980e-15 (1)
*222891: obj =   7.367009389e-01 inf =   2.672e-15 (0)
OPTIMAL LP SOLUTION FOUND
GLPK Simplex Optimizer, v4.65
1673 rows, 3098 columns, 13088 non-zeros
*222891: obj =   7.367009389e-01 inf =   2.672e-15 (0)
OPTIMAL LP SOLUTION FOUND
GLPK Simplex Optimizer, v4.65
1673 rows, 3098 columns, 13088 non-zeros
*222891: obj =   7.367009389e-01 inf =   2.672e-15 (0)
OPTIMAL LP SOLUTION FOUND
GLPK Simplex Optimizer, v4.65
1673 rows, 3098 columns, 13088 non-zeros
*222891: obj =   7.367009389e-01 inf =   2.672e-15 (0)
OPTIMAL LP SOLUTION FOUND
GLPK Simplex Optimizer, v4.65
1673 rows, 3098 columns, 13088 non-zeros
*222891: obj =   7.367009389e-01 inf =   2.672e-15 (0)
OPTIMAL LP SOLUTION FOUND

The script has not finished.

No idea what to do next...
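
With tens of thousands of log lines, a quick summary can be easier to read than the raw output. The following is a rough sketch that counts the two status messages visible in the excerpt above and remembers the last iteration line. The function name `summarize_glpk_log` and the file name in the usage comment are hypothetical, and the patterns are based only on the lines shown in this thread, so other GLPK messages are simply ignored.

```python
import re
from collections import Counter

# Status messages exactly as they appear in the log excerpt above.
STATUS_LINES = (
    "OPTIMAL LP SOLUTION FOUND",
    "LP HAS NO PRIMAL FEASIBLE SOLUTION",
)

# Iteration lines look like:
# "*222891: obj =   7.367009389e-01 inf =   2.672e-15 (0)"
ITERATION = re.compile(r"^\s*\*?\s*(\d+): obj =\s*(\S+)\s+inf =\s*(\S+)")


def summarize_glpk_log(lines):
    """Return (status counts, (last iteration number, last objective))."""
    counts = Counter()
    last_iteration = None
    for line in lines:
        text = line.strip()
        if text in STATUS_LINES:
            counts[text] += 1
            continue
        match = ITERATION.match(line)
        if match:
            last_iteration = (int(match.group(1)), float(match.group(2)))
    return counts, last_iteration


# Hypothetical usage, assuming the solver output was saved to a file:
# with open("solver.log") as handle:
#     counts, last = summarize_glpk_log(handle)
# print(counts, last)
```

If the iteration number and objective stop changing while the status counts keep growing, that may indicate the same LP is being re-solved over and over, which is what the tail of the excerpt above looks like.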