jump-dev / Dualization.jl

Automatic dualization feature for MathOptInterface.jl

SDP dualization slow, can read_from_file replace it for SDPA input? #134

Closed kocvara closed 2 years ago

kocvara commented 2 years ago

Hi,

When using "read_from_file" in, e.g.,

model = read_from_file("sdplib/thetaG11.dat-s")
set_optimizer(model, Mosek.Optimizer)
optimize!(model)

I get the SDP in the "dualized" form (in the notation of Yalmip), i.e., with many additional linear equality constraints, and Mosek (like other codes) is very slow when solving it. The performance of Mosek improves dramatically when using "dualize", as in

model = read_from_file("sdplib/thetaG11.dat-s")
dual_model = dualize(model)
set_optimizer(dual_model, Mosek.Optimizer)
optimize!(dual_model)

However, the "dualize(model)" command itself is very slow. I would suggest adding an option to "read_from_file" that, when reading the SDPA format, generates the dualized (for me primal) model directly; this should be straightforward, as all the data are read from the SDPA file anyway.

Would that be worth doing?

Michal

guilhermebodin commented 2 years ago

Hi @kocvara, could you share the "sdplib/thetaG11.dat-s" file? We can investigate performance issues in Dualization.jl. A read_from_file that dualizes the problem directly should be as fast as reading the primal and then dualizing, since you have to read the entire primal problem before dualizing anyway.

kocvara commented 2 years ago

It wouldn't, as the problem in the SDPA file is exactly in the "dualized" format

min c^T x s.t. \sum_i x_i A_i - A_0 \succeq 0

with the matrices A_i and the vector c written in sparse format. It is actually strange to me that read_from_file delivers the problem in a different format by introducing new variables and linear equality constraints. For instance, in Matlab/Yalmip, this file is read very quickly in the "original" (i.e., dualized by JuMP) format by

[con, obj] = loadsdpafile('../../../sdplib/thetaG11.dat-s');

which actually uses a SeDuMi command.
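For reference, here is a sketch of the two forms being discussed, written out from the formula in this comment. The multiplier Y and the inner product notation are introduced here for illustration and are not part of the original post. The first problem is the inequality form quoted in the comment; the second is its Lagrangian dual, which has one linear equality constraint per variable x_i and a single positive semidefinite matrix variable.

```latex
\begin{aligned}
\text{(inequality form)}\quad
  & \min_{x \in \mathbb{R}^m} \; c^T x
  && \text{s.t.}\quad \sum_{i=1}^{m} x_i A_i - A_0 \succeq 0, \\[4pt]
\text{(equality form)}\quad
  & \max_{Y \in \mathbb{S}^n} \; \langle A_0, Y \rangle
  && \text{s.t.}\quad \langle A_i, Y \rangle = c_i,\; i = 1, \dots, m,
     \quad Y \succeq 0 .
\end{aligned}
```

The second problem follows from the first by attaching Y \succeq 0 to the matrix inequality: the Lagrangian is \sum_i x_i (c_i - \langle A_i, Y \rangle) + \langle A_0, Y \rangle, which is bounded below in x only when every coefficient c_i - \langle A_i, Y \rangle vanishes.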

The file is here:

https://www.dropbox.com/s/zbefx2jgcvx99mn/thetaG11.dat-s?dl=0

Thanks

kocvara commented 2 years ago

To be more specific, model = read_from_file("sdplib/thetaG11.dat-s") gives the problem in format A and dual_model = dualize(model) in format B, while Matlab/Yalmip

[con, obj] = loadsdpafile('../../../sdplib/thetaG11.dat-s');

gives format B and

[cond,objd,X,free] = dualize(con,obj)

format A, both very quickly.

guilhermebodin commented 2 years ago

Oh, now I get it. You can probably ask a question on the Julia Discourse or open an issue on MathOptInterface, where the read_from_file function is defined: https://github.com/jump-dev/MathOptInterface.jl/issues.

blegat commented 2 years ago

Thanks for reporting this performance issue. Using the files from https://github.com/vsdp/SDPLIB, I get

julia> model = @time read_from_file("/home/blegat/git/SDPLIB/data/theta1.dat-s")
  0.001950 seconds (6.03 k allocations: 537.516 KiB)
A JuMP Model
Minimization problem with:
Variables: 104
Objective function type: AffExpr
`Vector{AffExpr}`-in-`MathOptInterface.PositiveSemidefiniteConeTriangle`: 1 constraint
Model mode: AUTOMATIC
CachingOptimizer state: NO_OPTIMIZER
Solver name: No optimizer attached.

julia> @time dualize(model)
  0.028494 seconds (16.73 k allocations: 14.693 MiB, 73.94% gc time)
A JuMP Model
Maximization problem with:
Variables: 1275
Objective function type: AffExpr
`AffExpr`-in-`MathOptInterface.EqualTo{Float64}`: 104 constraints
`Vector{VariableRef}`-in-`MathOptInterface.PositiveSemidefiniteConeTriangle`: 1 constraint
Model mode: AUTOMATIC
CachingOptimizer state: NO_OPTIMIZER
Solver name: No optimizer attached.
Names registered in the model:

julia> model = @time read_from_file("/home/blegat/git/SDPLIB/data/thetaG11.dat-s")
  0.020859 seconds (48.38 k allocations: 13.671 MiB)
A JuMP Model
Minimization problem with:
Variables: 2401
Objective function type: AffExpr
`Vector{AffExpr}`-in-`MathOptInterface.PositiveSemidefiniteConeTriangle`: 1 constraint
Model mode: AUTOMATIC
CachingOptimizer state: NO_OPTIMIZER
Solver name: No optimizer attached.

julia> @time dualize(model)
231.668380 seconds (4.27 M allocations: 793.687 GiB, 3.51% gc time)
A JuMP Model
Maximization problem with:
Variables: 321201
Objective function type: AffExpr
`AffExpr`-in-`MathOptInterface.EqualTo{Float64}`: 2401 constraints
`Vector{VariableRef}`-in-`MathOptInterface.PositiveSemidefiniteConeTriangle`: 1 constraint
Model mode: AUTOMATIC
CachingOptimizer state: NO_OPTIMIZER
Solver name: No optimizer attached.
Names registered in the model:
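As a rough consistency check on the sizes printed above (an inference from the output, not something stated in the thread): if the dualized model stores a symmetric n-by-n matrix variable by its triangle, it has n(n+1)/2 scalar variables, and it has one equality constraint per variable of the model that was read. The values of n below are assumptions obtained by solving n(n+1)/2 for the printed variable counts.

```latex
\begin{aligned}
\texttt{theta1:}   \quad & n = 50,  & \tfrac{n(n+1)}{2} &= \tfrac{50 \cdot 51}{2}   = 1275,   & m &= 104  \text{ equality constraints}, \\
\texttt{thetaG11:} \quad & n = 801, & \tfrac{n(n+1)}{2} &= \tfrac{801 \cdot 802}{2} = 321201, & m &= 2401 \text{ equality constraints}.
\end{aligned}
```

Under that reading, the dualized thetaG11 model has about 250 times as many scalar variables as the dualized theta1 model, while the reported dualize time grows from about 0.03 seconds to about 232 seconds.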

Reopening as I think we should investigate.