jump-dev / Convex.jl

A Julia package for disciplined convex programming
https://jump.dev/Convex.jl/stable/
Other
569 stars 123 forks source link

wrong conversion to conic form of LASSO #219

Closed nantonel closed 4 years ago

nantonel commented 6 years ago

Hi all,

I was testing a simple LASSO problem with Convex.jl. I also wanted to see the difference between cvxpy.

This is the script I've worked with:

using Mosek
using SCS

srand(123)
println(" \n \n \n solving Convex \n \n \n")
n,m = 100, 10
h = randn(n)

x0 = Variable(m)

x = full(sprandn(m, 0.05))
y = conv(h,x)+randn(n+m-1)

slv = SCSSolver()
#slv = MosekSolver()
lambda = 0.01

problem = minimize(0.5*norm(conv(h,x0)-y)^2+lambda*norm(x0,1))
solve!(problem, slv)

xConvex = x0.value

println(" \n \n \n solving cvxpy \n \n \n")
using PyCall
@pyimport cvxpy as cvx

slv = cvx.SCS
x0 = cvx.Variable(m)

problem = cvx.Problem(cvx.Minimize(cvx.sum_squares(cvx.conv(h,x0)-y)*0.5+cvx.norm1(x0)*lambda))
problem[:solve](solver = slv, verbose = true)

xcvxpy = x0[:value]

norm(xcvxpy-xConvex)

I'm using the very same version of SCS (the iterative version has been updated very recently on the master of SCS.jl) and Convex.jl master as well.

this is the REPL out for

  1. Convex with SCS
    
    ----------------------------------------------------------------------------
    SCS v2.0.2 - Splitting Conic Solver
    (c) Brendan O'Donoghue, Stanford University, 2012-2017
    ----------------------------------------------------------------------------
    Lin-sys: sparse-indirect, nnz in A = 1056, CG tol ~ 1/iter^(2.00)
    eps = 1.00e-05, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00
    acceleration_lookback = 20, rho_x = 1.00e-03
    Variables n = 23, constraints m = 135
    Cones:  primal zero / dual free vars: 1
    linear vars: 21
    soc vars: 113, soc blks: 2
    Setup time: 1.21e-04s
    ----------------------------------------------------------------------------
    Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
    ----------------------------------------------------------------------------
     0| 2.05e+00  1.94e+00  9.55e-01 -7.96e+00  1.32e+01  6.54e-15  7.88e-05 
    100| 5.19e-03  1.55e-02  5.14e-06  4.88e+01  4.88e+01  2.57e-16  5.47e-03 
    200| 1.89e+00  9.93e-01  1.06e-01  1.38e+02  1.71e+02  3.22e+01  1.35e-02 
    ...
    4900| 1.22e-01  2.76e-01  2.52e-03  1.74e+03  1.73e+03  8.95e+00  3.76e-01 
    5000| 2.05e+00  2.54e+00  2.10e-04  4.03e+03  4.03e+03  6.00e+00  3.81e-01 
    ----------------------------------------------------------------------------
    Status: Infeasible/Inaccurate
    Hit max_iters, solution may be inaccurate
    Timing: Solve time: 3.81e-01s
    Lin-sys: avg # CG iterations: 8.22, avg solve time: 2.88e-05s
    Cones: avg projection time: 4.07e-07s
    Acceleration: avg step time: 4.32e-05s
    ----------------------------------------------------------------------------
    Certificate of primal infeasibility:
    dist(y, K*) = 6.1243e-03
    |A'y|_2 * |b|_2 = 2.0970e-02
    b'y = -1.0000
    ============================================================================
    WARNING: Problem status UnknownError; solution may be inaccurate.

2. `cvxpy` with `SCS`
```julia
----------------------------------------------------------------------------
    SCS v2.0.2 - Splitting Conic Solver
    (c) Brendan O'Donoghue, Stanford University, 2012-2017
----------------------------------------------------------------------------
Lin-sys: sparse-indirect, nnz in A = 1042, CG tol ~ 1/iter^(2.00)
eps = 1.00e-05, alpha = 1.50, max_iters = 5000, normalize = 1, scale = 1.00
acceleration_lookback = 20, rho_x = 1.00e-03
Variables n = 21, constraints m = 131
Cones:  linear vars: 20
    soc vars: 111, soc blks: 1
Setup time: 1.38e-04s
----------------------------------------------------------------------------
 Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
----------------------------------------------------------------------------
     0| 1.87e+00  2.45e+00  9.78e-01 -1.18e+01  3.33e+01  1.13e-14  1.05e-01 
   100| 8.12e-03  8.19e-02  1.71e-04  4.93e+01  4.93e+01  6.23e-15  7.43e+00 
 ...
   440| 3.21e-07  2.33e-06  3.02e-08  4.89e+01  4.89e+01  1.08e-15  3.82e+01 
----------------------------------------------------------------------------
Status: Solved
Timing: Solve time: 3.82e+01s
    Lin-sys: avg # CG iterations: 4.59, avg solve time: 8.35e-02s
    Cones: avg projection time: 1.23e-05s
    Acceleration: avg step time: 2.49e-03s
----------------------------------------------------------------------------
Error metrics:
dist(s, K) = 1.5520e-03, dist(y, K*) = 0.0000e+00, s'y/|s||y| = 1.0961e-08
primal res: |Ax + s - b|_2 / (1 + |b|_2) = 3.2062e-07
dual res:   |A'y + c|_2 / (1 + |c|_2) = 2.3302e-06
rel gap:    |c'x + b'y| / (1 + |c'x| + |b'y|) = 3.0175e-08
----------------------------------------------------------------------------
c'x = 48.8983, -b'y = 48.8983
============================================================================
  1. Convex.jl + Mosek.jl
    
    Problem
    Name                   :                 
    Objective sense        : min             
    Type                   : CONIC (conic optimization problem)
    Constraints            : 135             
    Cones                  : 2               
    Scalar variables       : 136             
    Matrix variables       : 0               
    Integer variables      : 0               

Optimizer started. Presolve started. Linear dependency checker started. Linear dependency checker terminated. Eliminator - tries : 0 time : 0.00
Lin. dep. - tries : 1 time : 0.00
Lin. dep. - number : 0
Presolve terminated. Time: 0.00
Problem Name :
Objective sense : min
Type : CONIC (conic optimization problem) Constraints : 135
Cones : 2
Scalar variables : 136
Matrix variables : 0
Integer variables : 0

Optimizer - threads : 4
Optimizer - solved problem : the dual
Optimizer - Constraints : 22 Optimizer - Cones : 2 Optimizer - Scalar variables : 133 conic : 113
Optimizer - Semi-definite variables: 0 scalarized : 0
Factor - setup time : 0.00 dense det. time : 0.00
Factor - ML order time : 0.00 GP order time : 0.00
Factor - nonzeros before factor : 88 after factor : 88
Factor - dense dim. : 0 flops : 1.15e+04
ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME
0 2.9e+00 1.0e+00 1.5e+00 0.00e+00 0.000000000e+00 -2.000000000e+00 1.0e+00 0.00
1 4.1e-01 1.4e-01 9.2e-02 -8.75e-01 3.890825350e+00 2.064259366e+01 1.4e-01 0.00
2 9.3e-02 3.3e-02 1.4e-02 -6.96e-01 1.882791814e+01 6.375251688e+01 3.3e-02 0.00
3 2.2e-02 7.6e-03 5.3e-03 1.99e-02 4.514846416e+01 6.260221610e+01 7.6e-03 0.00
4 4.2e-03 1.5e-03 4.9e-03 1.07e+00 5.175112177e+01 5.236274995e+01 1.5e-03 0.00
5 1.8e-03 6.3e-04 6.2e-03 2.58e+00 4.967753607e+01 4.973378283e+01 6.3e-04 0.00
6 7.3e-04 2.6e-04 3.5e-03 3.77e+00 4.893193157e+01 4.897447398e+01 2.6e-04 0.00
7 9.7e-05 3.4e-05 1.4e-03 1.25e+00 4.889900446e+01 4.890362117e+01 3.4e-05 0.00
8 1.5e-05 5.4e-06 5.9e-04 1.03e+00 4.889769587e+01 4.889833210e+01 5.4e-06 0.00
9 6.0e-07 2.1e-07 1.4e-04 1.00e+00 4.889833806e+01 4.889835487e+01 2.1e-07 0.00
10 4.1e-10 4.7e-11 4.0e-06 1.00e+00 4.889832083e+01 4.889832084e+01 4.7e-11 0.00
Optimizer terminated. Time: 0.00

Interior-point solution summary Problem status : PRIMAL_AND_DUAL_FEASIBLE Solution status : OPTIMAL Primal. obj: 4.8898320835e+01 nrm: 1e+02 Viol. con: 2e-16 var: 0e+00 cones: 1e-09
Dual. obj: 4.8898320837e+01 nrm: 2e+01 Viol. con: 0e+00 var: 2e-10 cones: 0e+00


the solution of `Convex.jl` + `SCS.jl` does not coincide with the one of `Convex.jl` + `Mosek.jl` and `cvxpy`+`SCS` (which do agree). it is actually a vector of `NaN`. 

this doesn't happen for larger values of `lambda`, e.g. with `lambda = 0.1`.

not really sure of what is happening, could it be a problem of `SCS.jl`? 

I didn't open the issue there since I found it odd that `cvxpy` and `Convex.jl` give different numbers of constraints and variables as you can see from the `REPL` out. So I though this might be an issue when converting the problem to standard form...but I might be completely wrong! 

Thanks for your help,
Niccolo' 
nantonel commented 6 years ago

I have an update. I finally used JuMP by directly specifying the conic formulation of the LASSO that can be found in this paper.

The following script can be added to the one above:

println(" \n \n \n solving JuMP \n \n \n")
using JuMP

A = hcat([[zeros(i);h;zeros(m-1-i)] for i = 0:m-1]...) # Full Matrix

M = Model(solver = SCSSolver())
@variables M begin
        x[1:m]
        t[1:m]
        w
end
@objective(M,Min,0.5*w+lambda*ones(m)'*t)
@constraint(M, soc, norm( [1-w;2*(A*x-y)] ) <= 1+w)
@constraint(M,  x .<= t)
@constraint(M, -t .<= x)

solve(M)
xJuMP = getvalue(x)
norm(xcvxpy-xJuMP)

to verify that the very same iterations are performed by SCS using either cvxpy or JuMP. This doesn't happen in Convex.jl.

I think this confirms my previous guess: it seems Convex.jl converts the LASSO to the wrong conic formulation.

ericphanson commented 5 years ago

If I update the code to run on Julia 1.1 and the Convex.jl 0.12.0 (gist), then with the current release of SCS, Convex warns me that there was an "UnknownError" and I can see from the SCS logs it reached the maximum number of iterations, just as in your first post. If I increase the iterations a lot, to 500,000, then it does solve the problem and obtains the same solution as cvxpy with SCS and with Convex.jl + Mosek.

Does that really mean Convex is converting the problem to the wrong conic formulation, as opposed to some correct but inefficient formulation? The first seems much worse to me, although I agree the second is worth looking into as well.

ericphanson commented 4 years ago

Running the gist on Julia 1.3, Convex 0.13.2 and SCS.jl v0.6.6, I find SCS, Mosek, and cvxpy all give the same answer, and that SCS doesn't need extra iterations etc (the default parameters suffice).