Slow performance instantiating large problems

MBradbury commented 5 months ago

Details for the issue

What did you do?

Creating large problems in PULP is slow, there are a two issues:

Creating expressions via pulp.lpSum is slower than directly creating them via pulp.LpAffineExpression
Making constraints requires allocating and filling an additional 2 copies of an LpAffineExpression

import pulp, numpy as np, itertools

# Some reward matrix
reward = np.full((300, 200, 50), 10)

prob = pulp.LpProblem("Problem", pulp.constants.LpMaximize)
dv = pulp.LpVariable.dicts("dv",
       itertools.product(range(300), range(200), range(50)),
       cat=pulp.constants.LpBinary)

Slow objective function:

prob += pulp.lpSum([
            dv[a, b, c] * reward[a, b, c]
            for a in range(300)
            for b in range(200)
            for c in range(50)
        ])

Faster objective function:

prob += pulp.LpAffineExpression(
            (dv[a, b, c], reward[a, b, c])
            for a in range(300)
            for b in range(200)
            for c in range(50)
        )

Slow constraint:

for c in range(50):
  prob += pulp.lpSum([
              dv[a, b, c]
              for a in range(300)
              for b in range(200)
          ]) == 1

Faster constraint:

for c in range(50):
  prob += pulp.LpAffineExpression(
              (dv[a, b, c], 1)
              for a in range(300)
              for b in range(200)
          ) == 1

Making constraints is slow as self - other creates the first LpAffineExpression copy at https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L980, https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L983 or https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L986. A second copy is then made when LpConstraint's constructor is called (https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L1012).

It seems as if one of these copies can be avoided if the rhs is set when other is int/float. So in LpAffineExpression.__eq__ this would be:

def __eq__(self, other):
        if isinstance(other, (float, int)):
            return LpConstraint(self, const.LpConstraintEQ, rhs=other)
        else:
            return LpConstraint(self - other, const.LpConstraintEQ)

Avoiding the second copy of LpAffineExpression would require that LpConstraint does not inherit from LpAffineExpression, which is a more significant change.

Some performance numbers using the attached script:

Slow Objective: 16.692 seconds
Slow Constraint: 22.000 seconds
Fast Objective: 2.945 seconds
Fast Constraint: 20.278 seconds
Faster Constraint: 15.407 seconds

Happy to send a PR with the change to use rhs if this looks correct.
Unsure about the right way forward with lpSum, as LpAffineExpression's are constructed when doing the internal multiplication, ideally something more lightweight (e.g., tuple) would be used as the result of the multiplication.

Useful extra information

The info below often helps, please fill it out if you're able to. :)

What operating system are you using?

[X] Windows: Microsoft Windows [Version 10.0.19045.4291]

I'm using python version: Python 3.10.12

I installed PuLP via:

[X] pypi (python -m pip install pulp)

Did you also

[ ] Tried out the latest github version: https://github.com/coin-or/pulp
[X] Searched for an existing similar issue: https://github.com/coin-or/pulp/issues?utf8=%E2%9C%93&q=is%3Aissue%20

pchtsp commented 5 months ago

Thanks for the issue. Indeed some improvements can be done in efficiency here.

If you have a proposal, feel free to do a PR and I'll check it.

The comments make sense I can't check if the proposal is correct. We can let the tests check that.

F.

On Fri, May 3, 2024, 11:36 Matthew Bradbury @.***> wrote:

Details for the issue What did you do?

Creating large problems in PULP is slow, there are a two issues:

Creating expressions via pulp.lpSum is slower than directly creating them via pulp.LpAffineExpression

Making constraints requires allocating and filling an additional 2 copies of an LpAffineExpression

import pulp, numpy as np, itertools

Some reward matrixreward = np.full((300, 200, 50), 10)

prob = pulp.LpProblem("Problem", pulp.constants.LpMaximize)dv = pulp.LpVariable.dicts("dv", itertools.product(range(300), range(200), range(50)), cat=pulp.constants.LpBinary)

Slow objective function:

prob += pulp.lpSum([ dv[a, b, c] * reward[a, b, c] for a in range(300) for b in range(200) for c in range(50) ])

Faster objective function:

prob += pulp.LpAffineExpression( (dv[a, b, c], reward[a, b, c]) for a in range(300) for b in range(200) for c in range(50) )

Slow constraint:

for c in range(50): prob += pulp.lpSum([ dv[a, b, c] for a in range(300) for b in range(200) ]) == 1

Faster constraint:

for c in range(50): prob += pulp.LpAffineExpression( (dv[a, b, c], 1) for a in range(300) for b in range(200) ) == 1

Making constraints is slow as self - other creates the first LpAffineExpression copy at https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L980, https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L983 or https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L986. A second copy is then made when LpConstraint's constructor is called ( https://github.com/coin-or/pulp/blob/master/pulp/pulp.py#L1012).

It seems as if one of these copies can be avoided if the rhs is set when other is int/float. So in LpAffineExpression.eq this would be:

def eq(self, other): if isinstance(other, (float, int)): return LpConstraint(self, const.LpConstraintEQ, rhs=other) else: return LpConstraint(self - other, const.LpConstraintEQ)

Avoiding the second copy of LpAffineExpression would require that LpConstraint does not inherit from LpAffineExpression, which is a more significant change.

Some performance numbers using the attached https://github.com/coin-or/pulp/files/15198221/test.txt script:

Slow Objective: 16.692 seconds Slow Constraint: 22.000 seconds Fast Objective: 2.945 seconds Fast Constraint: 20.278 seconds Faster Constraint: 15.407 seconds

Happy to send a PR with the change to use rhs if this looks correct.

Unsure about the right way forward with lpSum, as LpAffineExpression's are constructed when doing the internal multiplication, ideally something more lightweight (e.g., tuple) would be used as the result of the multiplication.

Useful extra information

The info below often helps, please fill it out if you're able to. :) What operating system are you using?

Windows: Microsoft Windows [Version 10.0.19045.4291]

I'm using python version: Python 3.10.12 I installed PuLP via:

pypi (python -m pip install pulp)

Did you also

Tried out the latest github version: https://github.com/coin-or/pulp

Searched for an existing similar issue: https://github.com/coin-or/pulp/issues?utf8=%E2%9C%93&q=is%3Aissue%20

— Reply to this email directly, view it on GitHub https://github.com/coin-or/pulp/issues/749, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJUZ42WYOW3JNOEGOKVY23ZANLB5AVCNFSM6AAAAABHFGOANWVHI2DSMVQWIX3LMV43ASLTON2WKOZSGI3TOMRZGYYDMNQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

MBradbury commented 5 months ago

Some initial work here: https://github.com/MBradbury/pulp/tree/perf

My test file now shows these results:

Slow Objective: 16.397 seconds
Slow Constraint: 7.267 seconds
Fast Objective: 2.675 seconds
Fast Constraint: 6.413 seconds

Faster Constraint is no longer present as the change has been incorporated.

No change in performance of building objectives, improvement for constraints. There are still some performance wins when using tuples over lpSum, but constraints are faster now as they avoid copying LpAffineExpression content and just use the object that was provided to them.

The tests I am able to run are fine:

Ran 1327 tests in 40.831s

OK (skipped=1264)

coin-or / pulp