plasmo-dev / Plasmo.jl

A Platform for Scalable Modeling and Optimization

Link constraint creation time increases with number of subgraphs #124

Open dlcole3 opened 1 week ago

dlcole3 commented 1 week ago

I am finding that the time it takes to add new link constraints to a graph increases with the number of subgraphs that exist on the graph. You can see this behavior in the MWE below, where the time printed in the loop increases monotonically at each iteration. Any ideas why this might be occurring? I do not think this happened with Plasmo v0.5.

using Plasmo
using Random

# root graph with a single subgraph (g1) that later subgraphs will link against
r = rand(100)
g = OptiGraph()
g1 = OptiGraph()
@optinode(g1, n)
@variable(n, 0 <= x[1:100] <= 1)
@objective(n, Min, sum(x[i] * r[i] for i in 1:100))
add_subgraph!(g, g1)

# add a new subgraph to g and link each of its 100 variables to the corresponding variable on g1
function build_subgraph(g, g1)
    new_r = rand(100)
    new_g = OptiGraph()
    @optinode(new_g, n)
    @variable(n, 0 <= x[1:100] <= 1)
    @objective(n, Min, sum(x[i] * new_r[i] for i in 1:100))

    add_subgraph!(g, new_g)
    for i in 1:100
        @linkconstraint(g, new_g[:n][:x][i] + g1[:n][:x][i] == new_r[i])
    end
end

for j in 1:20
    t = @elapsed begin
        build_subgraph(g, g1)
    end
    println(t)
end

I think this scaling is coming from the additional subgraphs, not from the additional link constraints. To make sure it is not the number of link constraints that is causing the increase in creation time, I also ran the code below, which keeps the number of subgraphs fixed while adding many more link constraints; the times printed in each iteration of the loop stay roughly constant.

# add 1000 link constraints between two existing subgraphs without adding any new subgraphs
function add_random_link_constraints(g, g1, g2)
    r = rand(1000)
    for i in 1:1000
        @linkconstraint(g, g1[:n][:x][1] + g2[:n][:x][1] >= -r[i])
    end
end

for j in 1:20
    g2 = getsubgraphs(g)[2]
    t = @elapsed begin
        add_random_link_constraints(g, g1, g2)
    end
    println(t)
end
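
For completeness, a variation that isolates the effect of the subgraph count would look something like the sketch below (the function name is just illustrative): each call adds one new subgraph but times only the creation of a single link constraint against g1.

# sketch: add one subgraph per call, but time only the link constraint creation,
# not the subgraph construction
function add_subgraph_and_time_link(g, g1)
    new_g = OptiGraph()
    @optinode(new_g, n)
    @variable(n, 0 <= x[1:100] <= 1)
    add_subgraph!(g, new_g)
    return @elapsed @linkconstraint(g, new_g[:n][:x][1] + g1[:n][:x][1] == 1)
end

for j in 1:20
    println(add_subgraph_and_time_link(g, g1))
end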
dlcole3 commented 5 days ago

@jalving, I profiled some of the code above to see where the bottleneck is. You can see one of the flamegraphs below. This flamegraph is the result of profiling the build_subgraph function in the first code snippet above after running the loop for 20 iterations. If I am reading the flamegraph correctly, the bottleneck comes largely from the _add_backend_variables(::Plasmo.GraphMOIBackend, ::Vector{NodeVariableRef}) function here, and within that function largely from the setdiff call. Is there a way to avoid or speed up this call? For instance, is there a way to guarantee that the variables are already added to the backend so that they do not have to be checked and added during this _add_backend_variables call? Thoughts?

[flamegraph: profile of build_subgraph after 20 iterations of the loop]
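
To make the concern concrete, here is a small, self-contained illustration (generic names, not Plasmo's internals) of why, if the setdiff is comparing the incoming variables against everything already tracked by the backend (which is my read of the flamegraph), the cost scales with the total variable count, while a Set/Dict membership check only scales with the variables being added:

# illustration only; all_vars stands in for every variable already tracked by the
# backend, new_vars for the variables of a newly added subgraph
all_vars = collect(1:100_000)
new_vars = collect(100_001:100_100)

# setdiff builds a Set from new_vars and then walks all of all_vars to delete matches,
# so each call costs O(length(all_vars)), and all_vars keeps growing as subgraphs are added
vars_to_add = setdiff(new_vars, all_vars)

# a Set (or an existing Dict keyed by the backend's variables) makes the same filtering
# O(length(new_vars)) per call, independent of how large the graph has become
seen = Set(all_vars)  # would be maintained incrementally in practice, not rebuilt each call
vars_to_add_fast = filter(v -> !(v in seen), new_vars)

If the backend's existing variable-to-index map could serve as that membership check, or if callers could guarantee the variables were already registered, the setdiff call might be avoidable entirely.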