ericphanson commented 4 years ago

I'm not sure if the issue is in Gurobi.jl, MOI, or EAGO, but I thought I'd ask here first.

I have a small problem in which using GLPK has no issues, but Gurobi reports NaN's. What is extra strange is that this only occurs if the problem is formulated where the objective is an @NLexpression. I have a gist with a Manifest.toml that you can use to reproduce these results:

https://gist.github.com/ericphanson/a50888d0749d3c51db112148284ff5d1

The problem is just

m = Model()
@variable(m, 0 <= v[i=1:2] <= 1)
@variable(m, 0 <= D[i=1:2,j=1:2] <= 1)
@NLexpression(m, R, sum(D[i,j] for j = 1:2, i = 1:2))
@NLexpression(m, term, sum(v[l] * R * v[k] for l = 1:2, k = 1:2))
@NLobjective(m, Min, term)

and if I use

@NLobjective(m, Min, sum(v[l] * R * v[k] for l = 1:2, k = 1:2))

instead of the @NLobjective, the error goes away, or if I use GLPK instead of Gurobi, the error goes away.

Any ideas of what could be going wrong?

odow commented 4 years ago

Presumably EAGO is solving a model, modifying the problem, then querying part of the solution. This can invalidate the cached solution inside Gurobi. GLPK is typically more tolerant of returning the solution pre modification.

mewilhel commented 4 years ago

@odow That's correct. We basically have one relaxed problem instance that we are continually modifying with an updated relaxation for each node in the branch and bound tree.

@ericphanson Thanks for the feedback. We're pushing out a fairly large update to EAGO early need week that should resolve this issue and a few other bugs we've encountered. I'll add resolving this issue to our internal checklist.

ericphanson commented 4 years ago

@mewilhel ok that's great to hear, thanks! I'm planning on doing some numerical experiments to compare some different algorithms for my problem of interest (https://github.com/ericphanson/GuessworkQuantumSideInfo.jl/pull/6) so I'll hold off until that release.

mewilhel commented 4 years ago

@ericphanson We've tagged version 0.4 which should resolve this issue (I'll be updating the documentation website today). Let us know if you have any further issues. I believe EAGO is working well with Gurobi and GLPK now. CPLEX's MOI wrapper seems a little less robust currently, so I'd avoid it for now.

Note that currently EAGO is a NLP solver (mixed-integer support coming soon), and it tends to do better when using large complex functions in the constraints and objective vs. equation oriented reformulations thereof.

We're also in the process of putting together a roadmap for this. So let us know if there is any functionality you are particularly interested in.

ericphanson commented 4 years ago

That's awesome to hear! I will try to retry my code soon.

Regarding

Note that currently EAGO is a NLP solver (mixed-integer support coming soon), and it tends to do better when using large complex functions in the constraints and objective vs. equation oriented reformulations thereof.

What do you mean by equation-oriented reformulations? My actual problem is this one:

https://github.com/ericphanson/GuessworkQuantumSideInfo.jl/blob/349a65473e72a397018c4e2d2c0af0209f1990db/src/ellipsoid/separation_oracle.jl#L131-L150

does that look like a suitable formulation?

odow commented 4 years ago

CPLEX's MOI wrapper seems a little less robust currently, so I'd avoid it for now.

Details?

mewilhel commented 4 years ago

@odow Oh, there's a potential assertion error on versus a status code return in: https://github.com/jump-dev/CPLEX.jl/issues/301.

I'm going to finish up a pull request to close this shortly.

mewilhel commented 4 years ago

@ericphanson Sorry, that terminology is out of the process-flow sheet simulation world. Basically, you can write these models as either (equation-oriented approach) a series of blocks of equations which results in a large number of simple equations and variables or (sequential modular approach) a series of blocks of expressions that must be evaluated in order which depend on a small number of variables and a few very complicated complex equations.

That model looks fine to me.

ericphanson commented 4 years ago

Ok, thanks for the clarification and for taking a look at my model! I'm coming from a very different area (quantum information theory), so it can be hard to understand.

ericphanson commented 4 years ago

Sorry for the delay, I've just gotten to look at this again. The new version indeed allows the Gurobi to be used as a solver in these examples. Thanks very much for the fix!

I've also noticed a big drop in the performance on my problem of interest with the new release of EAGO. With the Manifest from https://github.com/ericphanson/GuessworkQuantumSideInfo.jl/tree/349a65473e72a397018c4e2d2c0af0209f1990db/test/ellipsoid_tests (which has a specific commit from EAGO master that I happened to be using when I set up those tests), the tests.jl file runs in 15 minutes without an issue, but now I hit iteration limits in the problems and do not get very far through the tests.

I'll try to get a minimal example (I don't see the perf drop in the example from my first post) to isolate the issue better, but in the meantime, I was wondering if maybe there was some change in the algorithm or settings that could explain this?

edit: maybe https://github.com/PSORLab/EAGO.jl/issues/47?

mewilhel commented 4 years ago

Most of the changes should have been fairly benign (internal data structure changes and whatnot). We didn't notice a regression on any of the interval benchmarks we've run.

We did make some modifications to upper bounding parameters (we kept needing to change these to ensure the solution process was robust for research problems we were solving) as well as a few additional routines to help EAGO generate numerically safe cutting planes (#47) being one of these. It's plausible a very slightly invalid cutting plane or two as previously speeding up your solution process.

47 could be related to this, we're looking to re-enable that in a numerically safe manner in the future but it requires bridging a few theoretic gaps and a pretty extensive addition to McCormick.jl. We've got an outline of how to do this and are currently working on it but it might be a while out.

I looked at the test file but I had a little trouble figuring out what optimization problem EAGO was solving in this context (not being a quantum information person and all :) ). So how many variables, constraints, type of objective, what nonlinear expressions you are encountering, and any other info on the problem form would be helpful.

ericphanson commented 4 years ago

I looked at the test file but I had a little trouble figuring out what optimization problem EAGO was solving in this context (not being a quantum information person and all :) ). So how many variables, constraints, type of objective, what nonlinear expressions you are encountering, and any other info on the problem form would be helpful.

Yep, sorry! The info I gave wasn't super helpful. Here is the problem with some sample inputs:

using JuMP, EAGO, LinearAlgebra, Test

# Formulate the problem
function make_model(p, ρBs, Y; nl_solver)
    dB = size(first(ρBs), 1)
    J = length(p)
    c = 1:J
    # JuMP can only handle real numbers,
    # so we split into real and imaginary parts.
    A_re = real.(ρBs) .* p
    A_im = imag.(ρBs) .* p
    Y_re = real.(Y)
    Y_im = imag.(Y)

    m = Model(nl_solver)
    # `D` is doubly stochastic:
    @variable(m, 0 <= D[i=1:J,j=1:J] <= 1)
    @constraint(m, sum(D, dims = 1) .== 1)
    @constraint(m, sum(D, dims = 2) .== 1)

    # `ψ` is any vector; the magnitudes do not play a role for whether or not the objective is
    # non-negative, so we constraint them which makes the branch-and-bound solver much faster.
    @variable(m, -1 <= ψ_re[i=1:dB] <= 1)
    @variable(m, -1 <= ψ_im[i=1:dB] <= 1)

    # R = (D*c)[i]*p[i]*ρBs[i]
    @NLexpression(m, R_re[l=1:dB,k=1:dB], sum(c[j]*A_re[i][l,k]*D[i,j] for j = 1:J, i = 1:J))
    @NLexpression(m, R_im[l=1:dB,k=1:dB], sum(c[j]*A_im[i][l,k]*D[i,j] for j = 1:J, i = 1:J))

    # t1 = <ψ, R ψ>, exploiting that this quantity is real
    @NLexpression(m, t1, sum(ψ_re[l] * R_re[l,k] * ψ_re[k] - ψ_re[l] * R_im[l,k] * ψ_im[k] + ψ_im[l]*R_im[l,k] *ψ_re[k] + ψ_im[l] * R_re[l,k]*ψ_im[k] for l = 1:dB, k = 1:dB))
    # t2 = <ψ, Y ψ>, exploiting that this quantity is real
    @NLexpression(m, t2, sum(ψ_re[l] * Y_re[l,k] * ψ_re[k] - ψ_re[l] * Y_im[l,k] * ψ_im[k] + ψ_im[l]*Y_im[l,k] *ψ_re[k] + ψ_im[l] * Y_re[l,k]*ψ_im[k] for l = 1:dB, k = 1:dB))
    @NLobjective(m, Min, t1 - t2)
    return m
end

# Input for the problem
p = ones(2)/2
Y = I(2)/2
ρBs = Array{Complex{Float64},2}[[0.5451788350595113 + 0.0im -0.01424959098834434 - 0.07145352881451071im; -0.01424959098834434 + 0.07145352881451071im 0.4548211649404892 + 0.0im], [0.4903245651329952 + 0.0im -0.24598228000600747 + 0.035150681823175274im; -0.24598228000600747 - 0.035150681823175274im 0.5096754348670048 + 0.0im]]

# Warmup
m = make_model(p, ρBs, Y; nl_solver = EAGO.Optimizer)
JuMP.optimize!(m)

# Solve
m = make_model(p, ρBs, Y; nl_solver = EAGO.Optimizer)
t = @elapsed JuMP.optimize!(m)
status = JuMP.termination_status(m)
optval = JuMP.objective_value(m)
@show (t, status, optval)

With EAGO 0.4.1 and JuMP 0.21.3, I get (t, status, optval) = (249.4802547, MathOptInterface.OPTIMAL, -0.013152295940780717), with the last iteration printed being iteration number 2139000.

With EAGO on the commit https://github.com/PSORLab/EAGO.jl/commit/0cf7d15a37ef2edf4c46a16e4db28532fa2e2c9c and JuMP 0.20.1 (and using with_optimizer(EAGO.Optimizer) instead of EAGO.Optimizer to take into account changes in JuMP syntax), I get (t, status, optval) = (0.1191552, MathOptInterface.OPTIMAL, -0.006580550649928066).

So something between these has caused it to dramatically increase the number of iterations required between those commits, and also change the answer. So that would back up your hypothesis that an erroneous cut was speeding up the solution. If that's the case, then I guess there isn't much to do!

That would be a bit disappointing, since I had hoped to be able to solve this problem fast enough to make the larger algorithm solvable in a reasonable amount of time (in a case where p and ρBs are of length 16, and Y is 4 by 4), and it had seemed like that was the case on commit 0cf7d15a37ef2edf4c46a16e4db28532fa2e2c9c (I could solve my problem of interest in 6 hours there). To be fair, Alpine and SCIP are also both slow to solve the problem. One advantage is that I only need to know if the solution is non-negative or not (and if it is negative, to get the resulting D which I use to generate a cut for the larger problem this is a part of).

mewilhel commented 4 years ago

@ericphanson Thanks for the clarification and apologies for the performance drop. The approach we had for reverse propagating relaxations worked well on a number on a number of classes of problems but we really can't distinguish apriori when correctly rounding the terms is necessary. So we wanted to revert this before a lot of user's arrived a questionable results and then roll that feature out again once we've resolved those issues.

Most of the cases we're looking at with EAGO have < 25 variables participating in nonlinear expressions (although those expressions often have a very large number of nonlinear terms). For larger flatter models, like yours we don't really get much out of the McCormick operator overloading cleverness. I would generally expect (Alpine, BARON, and ANTIGONE) to outperform EAGO on this class of problem. Each of those software packages has fairly sophisticated approaches to dealing with trilinear monomial terms which seem to be the main nonlinearity in your model. We're working on updating our handling of this particular nonlinearity but it's also still a work in progress.

ericphanson commented 4 years ago

@mewilhel Thanks for the info! That makes sense. I was surprised at how well it went originally, which made this technique (solving a global problem to check feasibility in an ellipsoid method) doable and better than other methods I've tried for solving this problem. I don't have a license for BARON or ANTIGONE, and Alpine is unfortunately too slow. That's ok though, I suspect this is a hard problem and anyway I don't need the solutions for any particular use!

Thanks for all the help! I'll close this now since the original issue is addressed.

PSORLab / EAGO.jl

Strange error with Gurobi and `@NLexpression`s #44

47 could be related to this, we're looking to re-enable that in a numerically safe manner in the future but it requires bridging a few theoretic gaps and a pretty extensive addition to McCormick.jl. We've got an outline of how to do this and are currently working on it but it might be a while out.