slow/no convergence of SCS on a tiny problem

kalmarek commented 2 years ago

Specifications

OS: julia distribution of SCS
SCS Version: 3.2.0
Compiler: gcc

Description

SCS-3.2.0 fails to converge.

How to reproduce

the attached file contains a problem that scs struggles to solve. Fixing max_iters=10_000 and grid search on acceleration_lookback = -20:1:20 and alpha = 0.1:0.05:1.99 the problem is solved successfully only 13 times.

Additional information

The correct objective value is 3825 / 4096 ≈ 0.933837890625. The problem is tiny: when solved, it's <2000 iterations, runs in 0.05s so it's easy to experiment. It's somehow difficult to explain as it is a symmetric reformulation of sum of squares problem here: https://github.com/kalmarek/SymbolicWedderburn.jl/blob/master/test/action_dihedral.jl

here is the write_data_filename: https://cloud.impan.pl/s/HeOfF5dZyG4SO3N

Output

e.g. unsuccessful solve:

------------------------------------------------------------------
           SCS v3.2.0 - Splitting Conic Solver
    (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 12, constraints m: 17
cones:    z: primal zero / dual free vars: 6
      s: psd vars: 11, ssize: 4
settings: eps_abs: 1.0e-06, eps_rel: 1.0e-06, eps_infeas: 1.0e-07
      alpha: 1.70, scale: 1.00e-01, adaptive_scale: 1
      max_iters: 5000, normalize: 1, rho_x: 1.00e-06
      acceleration_lookback: 8, acceleration_interval: 10
lin-sys:  sparse-direct
      nnz(A): 25, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 1.86e+01  1.00e+00  1.90e+01 -1.11e+01  1.00e-01  4.25e-05 
   250| 1.33e-02  1.60e-03  2.19e-02  5.87e-01  3.26e-01  2.27e-03 
   500| 9.43e-03  6.44e-04  9.17e-03  7.05e-01  3.26e-01  4.48e-03 
   750| 5.02e-02  6.61e-02  4.98e-04  9.23e-01  3.26e-01  6.75e-03  ← notice this jump in obj/gap/res
  1000| 6.27e-03  3.55e-04  4.54e-03  7.83e-01  3.26e-01  9.04e-03 
  1250| 5.43e-03  2.95e-04  3.67e-03  8.03e-01  3.26e-01  1.13e-02 
  1500| 4.81e-03  2.53e-04  3.10e-03  8.18e-01  3.26e-01  1.35e-02 
  1750| 4.32e-03  2.23e-04  2.69e-03  8.29e-01  3.26e-01  1.58e-02 
  2000| 3.93e-03  1.99e-04  2.37e-03  8.38e-01  3.26e-01  1.80e-02 
  2250| 3.61e-03  1.80e-04  2.13e-03  8.46e-01  3.26e-01  2.01e-02 
  2500| 3.34e-03  1.65e-04  1.93e-03  8.52e-01  3.26e-01  2.23e-02 
  2750| 3.12e-03  1.52e-04  1.77e-03  8.57e-01  3.26e-01  2.45e-02 
  3000| 2.07e-02  2.72e-02  1.59e-04  9.30e-01  3.26e-01  2.67e-02  ← notice this jump in obj/gap/res
  3250| 2.74e-03  1.31e-04  1.52e-03  8.66e-01  3.26e-01  2.89e-02 
  3500| 2.59e-03  1.23e-04  1.42e-03  8.70e-01  3.26e-01  3.11e-02 
  3750| 2.46e-03  1.16e-04  1.33e-03  8.73e-01  3.26e-01  3.33e-02 
  4000| 2.34e-03  1.10e-04  1.25e-03  8.76e-01  3.26e-01  3.55e-02 
  4250| 2.23e-03  1.04e-04  1.19e-03  8.79e-01  3.26e-01  3.78e-02 
  4500| 2.13e-03  9.91e-05  1.12e-03  8.81e-01  3.26e-01  4.00e-02 
  4750| 2.04e-03  9.45e-05  1.07e-03  8.83e-01  3.26e-01  4.22e-02 
  5000| 1.95e-03  9.03e-05  1.02e-03  8.85e-01  3.26e-01  4.45e-02 
------------------------------------------------------------------
status:  solved (inaccurate - reached max_iters)
timings: total: 4.46e-02s = setup: 1.27e-04s + solve: 4.45e-02s
     lin-sys: 6.74e-03s, cones: 1.53e-02s, accel: 5.34e-03s
------------------------------------------------------------------
objective = 0.885423 (inaccurate)
------------------------------------------------------------------

and a successful one:

------------------------------------------------------------------
           SCS v3.2.0 - Splitting Conic Solver
    (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 12, constraints m: 17
cones:    z: primal zero / dual free vars: 6
      s: psd vars: 11, ssize: 4
settings: eps_abs: 1.0e-06, eps_rel: 1.0e-06, eps_infeas: 1.0e-07
      alpha: 1.95, scale: 1.00e-01, adaptive_scale: 1
      max_iters: 5000, normalize: 1, rho_x: 1.00e-06
      acceleration_lookback: -15, acceleration_interval: 10
lin-sys:  sparse-direct
      nnz(A): 25, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 1.86e+01  1.00e+00  1.90e+01 -1.11e+01  1.00e-01  5.57e-05 
   250| 6.75e-03  1.28e-02  4.90e-02  5.95e-01  1.43e+00  3.09e-03 
   500| 4.02e-03  4.41e-03  1.24e-02  7.90e-01  1.43e+00  6.15e-03 
   750| 3.35e-04  1.94e-03  3.59e-08  9.34e-01  1.43e+00  9.32e-03 
  1000| 2.57e-06  8.33e-06  9.52e-06  9.34e-01  1.43e+00  1.25e-02 
  1125| 1.65e-06  1.62e-06  8.76e-07  9.34e-01  1.43e+00  1.40e-02 
------------------------------------------------------------------
status:  solved
timings: total: 1.42e-02s = setup: 1.26e-04s + solve: 1.40e-02s
     lin-sys: 1.96e-03s, cones: 5.33e-03s, accel: 1.82e-03s
------------------------------------------------------------------
objective = 0.933804
------------------------------------------------------------------

bodono commented 2 years ago

Thanks for posting this. I played around with it and I think it's just the 1/k convergence rate taking a long time to reach high accuracy, I was able to eventually converge for all settings. This is a good test case for some potential improvements.

kalmarek commented 2 years ago

ok, I thought that AA was supposed to be (guaranteed) faster than linear, but I've just fact-checked myself ;)

bodono commented 2 years ago

Unfortunately AA is not guaranteed to even provide a speedup, just that if AA is applied it will also eventually converge to the optimum (this might be out of date now, haven't quite kept up with the latest literature). However, in practice it does provide a speedup for many problems and I have added some safeguards so that it shouldn't really slow down any problem. This should translate to a net win on average.

cvxgrp / scs