PyPSA / linopy

Linear optimization with N-D labeled arrays in Python
https://linopy.readthedocs.io
MIT License

Update JuMP and linopy benchmark scripts #163

Closed · odow closed this 1 year ago

odow commented 1 year ago

Hey! I've been meaning to check this project out. People keep mentioning it to me.

Your docs are very nice, and I quite like this style of modeling. I couldn't get the benchmarks to run (it told me it couldn't find ortools-python, and it took ages trying to find a feasible install), but here are a couple of minor changes. (Feel free to close this PR if you would prefer to keep them as-is.)

I thought it might be helpful to explain why JuMP's memory usage is so high: we actually store three copies of the problem data.
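You can see the extra layering that a default `Model` adds (and that `direct_model` skips) by inspecting the backend. A minimal sketch, assuming Gurobi.jl is installed, with the printed types abbreviated:

```julia
using JuMP, Gurobi

# A default Model puts a caching layer (and possibly bridges) between
# JuMP and the solver, so the problem data is held in more than one place:
model = Model(Gurobi.Optimizer)
backend(model)    # MOIU.CachingOptimizer{...} wrapping the Gurobi optimizer

# direct_model attaches to the solver with no intermediate cache:
model = direct_model(Gurobi.Optimizer())
backend(model)    # the Gurobi.Optimizer itself
```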

Here's a demonstration:

```julia
using JuMP, Gurobi

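# Scalar, index-based formulation: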
function basic_model(n, solver)
    m = Model(solver)
    set_silent(m)
    N = 1:n
    M = 1:n
    @variable(m, x[N, M])
    @variable(m, y[N, M])
    @constraint(m, [i=N, j=M], x[i, j] - y[i, j] >= i-1)
    @constraint(m, [i=N, j=M], x[i, j] + y[i, j] >= 0)
    @objective(m, Min, sum(2 * x[i, j] + y[i, j] for i in N, j in M))
    optimize!(m)
    return objective_value(m)
end

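# Same model, written with vectorized (broadcast) constraints: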
function basic_model_1(n, solver)
    m = Model(solver) 
    set_silent(m)
    @variable(m, x[1:n, 1:n])
    @variable(m, y[1:n, 1:n])
    @constraint(m, x - y .>= 0:(n-1))
    @constraint(m, x + y .>= 0)
    @objective(m, Min, 2 * sum(x) + sum(y))
    optimize!(m)
    return objective_value(m)
end

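# direct_model builds the problem directly in the solver's memory,
# skipping JuMP's intermediate copy of the problem data: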
function basic_model_2(n, solver)
    m = direct_model(solver())
    set_silent(m)
    @variable(m, x[1:n, 1:n])
    @variable(m, y[1:n, 1:n])
    @constraint(m, x - y .>= 0:(n-1))
    @constraint(m, x + y .>= 0)
    @objective(m, Min, 2 * sum(x) + sum(y))
    optimize!(m)
    return objective_value(m)
end
```

```julia
julia> GC.gc(); @time basic_model(700, Gurobi.Optimizer)
  7.008127 seconds (32.83 M allocations: 2.411 GiB, 20.01% gc time)
8.56275e7

julia> GC.gc(); @time basic_model_1(700, Gurobi.Optimizer)
  7.231301 seconds (32.83 M allocations: 2.482 GiB, 22.91% gc time)
8.56275e7

julia> GC.gc(); @time basic_model_2(700, Gurobi.Optimizer)
  5.856433 seconds (28.91 M allocations: 1.932 GiB, 19.70% gc time)
8.56275e7
```

The biggest change with direct_model is that memory usage drops by roughly 20% (2.48 GiB down to 1.93 GiB; not the expected one third, because there is some other overhead as well).

If we turn off passing variable names to Gurobi, we get a further improvement:

```julia
julia> function basic_model_3(n, solver)
           m = direct_model(solver())
           set_string_names_on_creation(m, false)
           set_silent(m)
           @variable(m, x[1:n, 1:n])
           @variable(m, y[1:n, 1:n])
           @constraint(m, x - y .>= 0:(n-1))
           @constraint(m, x + y .>= 0)
           @objective(m, Min, 2 * sum(x) + sum(y))
           optimize!(m)
           return objective_value(m)
       end
basic_model_3 (generic function with 1 method)

julia> GC.gc(); @time basic_model_3(700, Gurobi.Optimizer)
  5.596101 seconds (23.03 M allocations: 1.728 GiB, 22.79% gc time)
8.56275e7
```

But I wouldn't expect most users to care about this.
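(Side note: if you ever want less noisy numbers than a one-shot `@time`, the usual approach is BenchmarkTools; a minimal sketch, at the cost of re-running the full build-and-solve for every sample:)

```julia
using BenchmarkTools

# @btime runs the call several times and reports the minimum,
# which excludes the compilation time that a single @time includes:
@btime basic_model_3(700, Gurobi.Optimizer)
```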

codecov[bot] commented 1 year ago

Codecov Report

Patch and project coverage have no change.

Comparison is base (620724b) 87.30% compared to head (95f7a45) 87.30%. Report is 1 commit behind head on master.

Additional details and impacted files:

```diff
@@           Coverage Diff           @@
##           master     #163   +/-   ##
=======================================
  Coverage   87.30%   87.30%
=======================================
  Files          14       14
  Lines        3119     3119
  Branches      707      707
=======================================
  Hits         2723     2723
  Misses        289      289
  Partials      107      107
```


FabianHofmann commented 1 year ago

Hey @odow, thank you for your PR. This improves readability and performance at the same time, nice :) JuMP is a wonderful package. Some years ago we even tried switching to Julia for our energy modelling system in order to benefit from JuMP, but it proved too cumbersome for various reasons...

Yes, JuMP's internal memory consumption has already been mentioned to me by a colleague, but it is still relatively low. For comparability, I would keep the variable names (the other packages keep them as well). Switching to the direct model would be totally fine with me.

I'll merge this one. If you have other suggestions, feel free to open a PR :)

(I think I should remove or-tools from the list; I also had huge problems installing it...)