jump-dev / Dualization.jl

Automatic dualization feature for MathOptInterface.jl
Other
97 stars 6 forks source link

Bug for Dualization of Maximization Problems #142

Closed LukasBarner closed 2 years ago

LukasBarner commented 2 years ago

I think this might be a general thing, please see the example below:

using JuMP, Dualization

MaxModel = Model()

@variable(MaxModel, Q)
@objective(MaxModel, Max, -(Q^2-Q) )
@constraint(MaxModel, C, Q>=0)

DualMaxModel = dualize(MaxModel;dual_names = DualNames("dual", ""))
println(DualMaxModel)
# This gives: 
#
# Min quadslack_Q²
# Subject to
#  Q : dualC_1 + 2 quadslack_Q = -1.0
#  dualC_1 ≥ 0.0

MinModel = Model()
@variable(MinModel, Q)
@objective(MinModel, Min, (Q^2-Q) )
@constraint(MinModel, C, Q>=0)

DualMinModel = dualize(MinModel; dual_names = DualNames("dual", ""))
println(DualMinModel)
# This gives: 
#
# Max -quadslack_Q²
# Subject to
#  Q : dualC_1 - 2 quadslack_Q = -1.0
#  dualC_1 ≥ 0.0

The DualMaxModel cannot be correct; the problem also persists for other settings... This issue came up when checking for the reason behind https://github.com/joaquimg/BilevelJuMP.jl/issues/182

odow commented 2 years ago

Just to clarify, the problem is the sign switch on the dual value? The objective values coincide.

@blegat is the authority on what our dual conventions are, so I don't know whether this was intentional or not.

julia> solution_summary(MaxModel; verbose = true)
* Solver : Ipopt

* Status
  Termination status : LOCALLY_SOLVED
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Result count       : 1
  Has duals          : true
  Message from the solver:
  "Solve_Succeeded"

* Candidate solution
  Objective value      : 2.50000e-01
  Dual objective value : 0.00000e+00
  Primal solution :
    Q : 5.00000e-01
  Dual solution :
    C : 5.02681e-09

* Work counters
  Solve time (sec)   : 4.70781e-03

julia> solution_summary(DualMaxModel; verbose = true)
* Solver : Ipopt

* Status
  Termination status : LOCALLY_SOLVED
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Result count       : 1
  Has duals          : true
  Message from the solver:
  "Solve_Succeeded"

* Candidate solution
  Objective value      : 2.50000e-01
  Dual objective value : 5.00000e-01
  Primal solution :
    dualC_1 : -4.97293e-09
    quadslack_Q : -5.00000e-01
  Dual solution :
    Q : -5.00000e-01

* Work counters
  Solve time (sec)   : 2.80129e-01
julia> solution_summary(MinModel; verbose = true)
* Solver : Ipopt

* Status
  Termination status : LOCALLY_SOLVED
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Result count       : 1
  Has duals          : true
  Message from the solver:
  "Solve_Succeeded"

* Candidate solution
  Objective value      : -2.50000e-01
  Dual objective value : 0.00000e+00
  Primal solution :
    Q : 5.00000e-01
  Dual solution :
    C : 5.02681e-09

* Work counters
  Solve time (sec)   : 7.67708e-03

julia> solution_summary(DualMinModel; verbose = true)
* Solver : Ipopt

* Status
  Termination status : LOCALLY_SOLVED
  Primal status      : FEASIBLE_POINT
  Dual status        : FEASIBLE_POINT
  Result count       : 1
  Has duals          : true
  Message from the solver:
  "Solve_Succeeded"

* Candidate solution
  Objective value      : -2.50000e-01
  Dual objective value : -5.00000e-01
  Primal solution :
    dualC_1 : -4.97293e-09
    quadslack_Q : 5.00000e-01
  Dual solution :
    Q : -5.00000e-01

* Work counters
  Solve time (sec)   : 6.70195e-03
LukasBarner commented 2 years ago

IMO, the sign flip between quadslack_Q and Q was the problem, there are two reasons why:

  1. From a more general / definitions perspective: When giving two equivalent problems (i.e. by flipping objective sense and coefficients at the same time), users might expect to get the same dual problem/solution.
  2. From a KKT / Strong Duality perspective: Assuming that primal Q from the example above is non-negative, I would have expected the optimal quadslack_Q to also be. From the dual constraint Q, it becomes obvious that this must be the case for DualMinModel, but can never be the case for DualMaxModel.

The second reason is what initially caused me to say "the DualMaxModel cannot be correct". However, I later realized that this may be a convention/definition thing and I might be completely wrong :D I came across this problem in BilevelJuMP, which makes use of Dualization. When Dualizing a lower level maximization problem, the current behavior actually causes incorrect results (see https://github.com/joaquimg/BilevelJuMP.jl/issues/182). If the current behavior is desired, this could probably be fixed in BilevelJuMP itself (i.e. by internally converting lower level problems to Min before calling anything related to Dualization).

joaquimg commented 2 years ago

You might be correct. We don't have the Max dualization written for quadratic programs: https://jump.dev/Dualization.jl/dev/manual/#Quadratic-problems-1 We would have to write them.

Note that the sign in front of quadslack_Q in the constraints should never matter, both should work.

joaquimg commented 2 years ago

This is the first part (theory): https://github.com/jump-dev/Dualization.jl/pull/149

So, for simply dualization, the sign change of quadslack's is irrelevant, however that makes a difference in KKT conditions. So we need to flip signs of the quadslacks in maximization problems to fix BilevelJuMP.

LukasBarner commented 2 years ago

In my understanding (coming from a Lagrangian duality definition), dual constraints are used to make the minimum/infimum (or vice versa for maximization) over primal variables in the Lagrange dual function explicit. This implies that the dual constraint above should be stationarity of the Lagrangean over primal variables. For both examples above (if I am not mistaken), taking the derivative of the Lagrangean wrt Q should yield a negative sign in front of the "2 quadslack_Q". I realize that in the quadratic dual problem case, it does not really make a practical difference which sign is used. But, from a more technical perspective (and also considering that the KKTs use the same stationarity definition of the Lagrangean), doesn't the sign flip problem remain an issue? Also not sure about this, but what happens when a user is interested in the original primal values after solving the dual. Isn't the sign-flip between the dual and primal Q confusing and a potential source of error?

joaquimg commented 2 years ago

We might be talking about the same thing. I added maximization versions for duality of QPs and the KKT also. Can you review to check if you agree with: https://jump.dev/Dualization.jl/previews/PR149/manual/#Maximization-6 ? Once we are settled there we can do the respective fixes.

LukasBarner commented 2 years ago

Ah yes, that was a misunderstanding :D I'm sry, I didn't have the time yesterday to check https://github.com/jump-dev/Dualization.jl/pull/149 in detail and misinterpreted something else... We were actually talking about the same thing!

LukasBarner commented 2 years ago

About fixes: I remember from checking earlier that there is a lot of sign flipping going on. Would this be a good point to take a look at that as well?

joaquimg commented 2 years ago

Fantastic. Now we can fix.

odow commented 2 years ago

I guess this is the same as https://github.com/jump-dev/Dualization.jl/issues/70 and linked issues.

LukasBarner commented 2 years ago

Feeling too inexperienced to take a look at all of the other sign flipping, but I do have a quick fix for this: https://github.com/jump-dev/Dualization.jl/pull/150. It's not perfect, since it adds sign-flips instead of reducing them, but it should not mess anything up and at least get conformity to the docs and BilevelJuMP maximization in lower levels working...

blegat commented 2 years ago

Thanks for the fix @LukasBarner !