coin-or / python-mip

Python-MIP: collection of Python tools for the modeling and solution of Mixed-Integer Linear programs
Eclipse Public License 2.0

Highs with python-mip is slow #372

Open colonetti opened 4 months ago

colonetti commented 4 months ago

Hello, everyone

Firstly, thanks for the great package and also for adding support for Highs.

However, creating an optimization model with Highs seems to be rather slow.

Describe the bug

Adding variables to a Highs optimization model is slow.

To Reproduce

import sys
import mip as pm
from timeit import default_timer as dt

def main(solver):
    m = pm.Model(solver_name=solver)
    ini = dt()
    x = {i: m.add_var() for i in range(10000)}
    print(f"{dt() - ini} seconds to add vars", flush=True)

if __name__ == '__main__':
    main(sys.argv[1])

Expected behavior

GRB: 0.021987227955833077 seconds to add vars
CBC: 0.026054809102788568 seconds to add vars
Highs: 6.408406416885555 seconds to add vars

Thanks again!

rschwarz commented 4 months ago

I have not looked into the code in a long while, but since the above call to add_var passes neither a name nor a variable type (integrality), the python-mip wrapper doesn't do much beyond calling Highs_addCol in a loop (see https://github.com/coin-or/python-mip/blob/master/mip/highs.py#L809). I'm not sure why this takes a lot of time, or how this could be avoided, because models in python-mip are created incrementally, not in bulk (vectorized), right? :-/

EDIT: Maybe the call to Highs_getNumCol is expensive and we could cache this count on the Python side. See also the implementation in the Julia/MOI wrapper for comparison: https://github.com/jump-dev/HiGHS.jl/blob/master/src/MOI_wrapper.jl#L738
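To make the caching idea concrete, here is a minimal sketch. It does not use the real HiGHS bindings: `FakeBackend` and `CachedColumnCounter` are hypothetical stand-ins, and `get_num_col` only plays the role of Highs_getNumCol. Whether that call is the actual bottleneck is not established here; the sketch only shows how the column count could be tracked on the Python side.

```python
class FakeBackend:
    """Hypothetical stand-in for a solver C API (not the real HiGHS bindings)."""

    def __init__(self):
        self._cols = []
        self.num_col_queries = 0  # counts how often the column count is queried

    def add_col(self, lb, ub, obj):
        self._cols.append((lb, ub, obj))

    def get_num_col(self):
        # Plays the role of a potentially expensive call into the solver.
        self.num_col_queries += 1
        return len(self._cols)


class CachedColumnCounter:
    """Hypothetical wrapper that tracks the column count in Python."""

    def __init__(self, backend):
        self._backend = backend
        # Query the backend once, then keep the count in sync locally.
        self._num_cols = backend.get_num_col()

    def add_var(self, lb=0.0, ub=float("inf"), obj=0.0):
        self._backend.add_col(lb, ub, obj)
        index = self._num_cols  # index of the column just added
        self._num_cols += 1
        return index


backend = FakeBackend()
model = CachedColumnCounter(backend)
indices = [model.add_var() for _ in range(10000)]
```

With this layout the backend is asked for its column count once, at construction, no matter how many variables are added afterwards. The cached count would go stale if columns were added or removed behind the wrapper's back, so a real implementation would need to update it in every code path that changes the number of columns.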

colonetti commented 4 months ago

Thanks, @rschwarz. I'll look into it.

metab0t commented 1 month ago

@rschwarz

The solution is to avoid using Highs_passColName and Highs_passRowName. They are quite expensive and should not be called for each variable/column.

The variable name is set automatically at https://github.com/coin-or/python-mip/blob/0ccb81115543e737ab74a4f1309891ce5650c8d5/mip/lists.py#L40 even when the user does not specify one.

I ran into the same problem in PyOptInterface and decided to maintain the name mapping internally: https://github.com/metab0t/PyOptInterface/commit/0074dc0af1be7071d98753bf35538f08d5f1dd52

Update: A fix has been proposed at https://github.com/ERGO-Code/HiGHS/pull/1782 to solve this problem.
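For readers wondering what "maintain the name mapping internally" could look like, here is a rough sketch. It is not the code from the PyOptInterface commit or from python-mip: the `LazyNames` class, its methods, and the `var(i)` default pattern are all assumptions made for illustration. The idea is that names live in Python dictionaries, so adding a variable never triggers a per-column naming call into the solver.

```python
class LazyNames:
    """Hypothetical helper that keeps variable names on the Python side."""

    def __init__(self):
        self._explicit = {}  # column index -> user-supplied name
        self._by_name = {}   # user-supplied name -> column index

    def set_name(self, index, name):
        # Drop the old reverse entry if this column is being renamed.
        old = self._explicit.get(index)
        if old is not None:
            del self._by_name[old]
        self._explicit[index] = name
        self._by_name[name] = index

    def name_of(self, index):
        # Default names are generated on demand instead of being stored.
        # The "var(i)" pattern is an assumption for this sketch.
        return self._explicit.get(index, f"var({index})")

    def index_of(self, name):
        # Only user-supplied names are resolved in this sketch.
        return self._by_name.get(name)


names = LazyNames()
names.set_name(3, "flow")
default_name = names.name_of(0)
explicit_name = names.name_of(3)
```

In this layout, unnamed variables cost nothing at creation time, and names would only need to reach the solver at the point where they matter, for example when a model file is written. That last step is left out here because it depends on the solver interface.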