Closed jachymb closed 2 years ago
Possibly related to #146
Hi,
I agree that this needs to be optimized, but remember that the model is created in an interpreted, dynamic language and it is solved using highly optimized native machine code.
If I remember correctly, the culprit here is xsum / LinExpr.add_term. More specifically, these lines in entities.py (around line 280):

    if isinstance(term, Var):
        self.add_var(term, coeff)
    elif isinstance(term, LinExpr):
        self.add_expr(term, coeff)
    elif isinstance(term, numbers.Real):
        self.add_const(term)
    else:
        ...
This is slow since, before adding each term to the LinExpr, it needs to check the term's type: it can be a constant, a variable, or another LinExpr. This dynamic dispatch makes it easy to build models, but it also makes model construction slow.
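To make the cost of that per-term dispatch concrete, here is a toy comparison (these are simplified stand-in classes, not python-mip's real ones): one accumulator re-checks every term's type the way entities.py does, the other assumes a homogeneous list of (variable, coefficient) pairs and skips the checks.

```python
import numbers

class Var:
    """Toy stand-in for mip.Var, identified by an index."""
    def __init__(self, idx):
        self.idx = idx

def add_term_checked(expr, term, coeff=1.0):
    # Mirrors the dispatch in entities.py: the type is re-checked per term.
    if isinstance(term, Var):
        expr[term.idx] = expr.get(term.idx, 0.0) + coeff
    elif isinstance(term, numbers.Real):
        expr["const"] = expr.get("const", 0.0) + term
    else:
        raise TypeError(term)

def add_terms_bulk(expr, pairs):
    # Assumes every item is a (Var, coeff) pair: no isinstance calls at all.
    for v, c in pairs:
        expr[v.idx] = expr.get(v.idx, 0.0) + c

pairs = [(Var(i), float(i)) for i in range(5)]
a, b = {}, {}
for v, c in pairs:
    add_term_checked(a, v, c)
add_terms_bulk(b, pairs)
print(a == b)  # True: same result, but the bulk path skips the type checks
```

Both paths build the same coefficient map; the second simply promises the input shape up front, which is exactly the trade-off discussed below.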
A faster approach would be a more "low level" way of creating constraints, i.e. passing the constraint's variable indexes and coefficients as numpy arrays (or cffi arrays). The code would be uglier and more verbose (like C code) but definitely much faster.
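As a sketch of that representation (illustrative only; python-mip does not expose such an API, and the helper name below is made up), a constraint could be carried as two parallel typed arrays of variable indexes and coefficients. The stdlib array module is used here; numpy arrays would work the same way:

```python
from array import array

def make_constraint_arrays(terms):
    """Pack (var_index, coefficient) pairs into parallel typed arrays.

    Hypothetical helper, not part of python-mip: it only illustrates
    the proposed low-level constraint representation.
    """
    idx = array("i", (i for i, _ in terms))    # variable indexes (ints)
    coef = array("d", (c for _, c in terms))   # coefficients (doubles)
    return idx, coef

idx, coef = make_constraint_arrays([(0, 1.5), (3, -2.0), (7, 0.25)])
print(list(idx))   # [0, 3, 7]
print(list(coef))  # [1.5, -2.0, 0.25]
```

A solver backend could hand buffers like these straight to native code without ever inspecting Python objects term by term.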
Cheers
From: Jáchym Barvínek notifications@github.com Reply-To: coin-or/python-mip reply@reply.github.com Date: Sunday, November 22, 2020 at 3:42 PM To: coin-or/python-mip python-mip@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [coin-or/python-mip] LinExpr.add_var / LinExpr.add_expr / LinExpr.add_term are way too slow (#163)
Consider the following function. It generates and solves a model structurally somewhat similar to a practical problem I am solving. (I tried to simplify it as much as possible)
    def experiment_mip(n):
        start = time.time()
        model = mip.Model(sense=mip.MAXIMIZE, solver_name=mip.CBC)
        obj = []
        for i in range(n):
            var = model.add_var(f"var_{i}", lb=0.5, ub=1.0)
            a = random.random() - 0.5  # Just a dummy value for illustration
            obj.append(var * a)
            constraint = mip.xsum(obj) <= 1.0
            model.add_constr(constraint, f"constraint_{i}")
        model.objective = mip.xsum(obj)
        construction_time = time.time() - start
        start = time.time()
        model.optimize()
        solution_time = time.time() - start
        return {"Model construction time": construction_time, "Solution time": solution_time}

Now, try running this for example for n=4000. The solution is much faster than the construction of the model. My measurements suggest that the solution time grows asymptotically like O(n^2), but the construction grows faster than O(n^3). The profiler shows that most time is spent in the LinExpr.add_var, LinExpr.add_expr and LinExpr.add_term functions. I think this is not acceptable: it should never take more time to construct a nontrivial problem than to solve it.
I am not sure what the solution could be.
I guess that in many situations, the modeler would input constraints in sum notation, knowing that no variable appears twice and every term is of the form coefficient * variable. In this case, it would make sense to offer an alternative constructor for LinExpr that takes a dict or a list of tuples. This would not provide nice-looking algebraic syntax, but it avoids many dictionary operations as well as the isinstance calls.
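A toy version of such a constructor (hypothetical; this class and its method names are made up for illustration and are not python-mip's API) might look like:

```python
class ToyLinExpr:
    """Minimal LinExpr-like container; illustrative only, not mip.LinExpr."""

    def __init__(self, coeffs=None):
        # coeffs maps variable -> coefficient; built without any
        # per-term isinstance dispatch
        self.coeffs = dict(coeffs) if coeffs else {}

    @classmethod
    def from_pairs(cls, pairs):
        """Build from a list of (variable, coefficient) tuples,
        merging repeated variables."""
        expr = cls()
        for v, c in pairs:
            expr.coeffs[v] = expr.coeffs.get(v, 0.0) + c
        return expr

e1 = ToyLinExpr({"x": 2.0, "y": -1.0})
e2 = ToyLinExpr.from_pairs([("x", 1.0), ("x", 1.0), ("y", -1.0)])
print(e1.coeffs == e2.coeffs)  # True
```

The dict form is fastest when the caller guarantees each variable appears once; the list-of-tuples form still merges duplicates but skips the type checks.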
With these minor changes (see the ### comments), on my laptop construction time is 1/3 of solution time for n = 4000:
    def experiment_mip(n):
        start = time.time()
        model = mip.Model(sense=mip.MAXIMIZE, solver_name=mip.CBC)
        obj = LinExpr()  ### do not build array of LinExpr but directly append to LinExpr
        for i in range(n):
            var = model.add_var(f"var_{i}", lb=0.5, ub=1.0)
            a = random.random() - 0.5  # Just a dummy value for illustration
            obj.add_var(var, a)  ### do not create intermediate LinExpr
            constraint = obj <= 1.0  ### no need to mip.xsum as obj is already the sum
            model.add_constr(constraint, f"constraint_{i}")
        model.objective = obj
        construction_time = time.time() - start
        start = time.time()
        model.optimize()
        solution_time = time.time() - start
        return {"Model construction time": construction_time, "Solution time": solution_time}
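One way to see why this change helps so much: calling mip.xsum(obj) inside the loop rebuilds the full sum at every iteration, so step i re-reads all i terms, roughly n^2/2 term visits in total, while appending into one running expression visits each term once. A toy count of term visits (no mip needed, just the arithmetic):

```python
def term_visits_rebuild(n):
    # xsum(obj) inside the loop re-reads all i accumulated terms at step i
    return sum(i for i in range(1, n + 1))

def term_visits_append(n):
    # appending to one running LinExpr touches each new term exactly once
    return n

n = 4000
print(term_visits_rebuild(n))  # 8002000 visits (quadratic)
print(term_visits_append(n))   # 4000 visits (linear)
```

This accounts for the superlinear construction time observed above, independent of how fast each individual add_term call is.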
Super interesting! That's a pretty nice solution; I'll try to find some time to hack this into the package.
Nice, @christian2022!
I like this code, and that's why we reintroduced in-place operations for LinExpr, so that you could also write obj += a * var (of course this would create an intermediate LinExpr). Anyhow, I agree that it would be useful to have something like an alternative LinExpr constructor that takes a dict or a list of tuples.
I've just included a test considering 5 variations of the code @jachymb and @christian2022 posted here. The test example is available in this link, and here are the 'construction' runtimes (in seconds) I got with n=10000:

    using_addvar:        9.43
    using_inplace_op:    9.88
    using_dict:         11.11
    using_lists:        14.58
    recreating_linexpr: 74.61
@rschwarz: you can now create a LinExpr from a dictionary or two lists (variables and coeffs).