Increased Solve Times for Sequence of Problems

kmpape commented 3 years ago

Hi there,

I am trying to solve a sequence of convex SDPs: where The matrix $\hat{R}$ is given and ill-conditioned.

Using SCS, I repeatedly solve the conic optimization problem and update $A,b$. For my particular problem, I have implemented all of this in C and compiled the program for a cluster using BLAS and CUDA.

The problem I am facing are increased solve times per SCS iteration as the above iteration goes on (log below) and I am struggling to understand why. If the number of non-zero elements in $A$ and $b$ does not change, what could cause these increased solve times per SCS iteration? In your paper, I see that the final algorithm (17) includes a projection $\Pi_\mathcal{C}$ - is that an iterative algorithm?

Any ideas are appreciated. Thank you very much.

Note: The attached log is for a run without warm-starting, but it turns out to be even worse with warm-starting. run.log

kalmarek commented 3 years ago

What takes more time in comparison is the linear solver:

Lin-sys: avg # CG iterations: 478.70, avg solve time: 1.31e+01s
...
Lin-sys: avg # CG iterations: 3325.51, avg solve time: 9.12e+01s

You may try sparse-direct solver (which will not run on a GPU). It factorizes the matrix once (which may take significant time), but then there shouldn't be a variable cost per iteration.

kmpape commented 3 years ago

Thank you @kalmarek, will try that.

kmpape commented 3 years ago

Thanks again for your hint @kalmarek. The per-iteration time is fine now; it takes 23 hrs to factor the matrix though. I believe that you are using QDLDL? Do you have any suggestions to speed that up or replace it with another routine? The problem is that I will need to repeatedly factor $A$.

    ----------------------------------------------------------------------------
            SCS v2.1.4 - Splitting Conic Solver
            (c) Brendan O'Donoghue, Stanford University, 2012
    ----------------------------------------------------------------------------

    Lin-sys: sparse-direct, nnz in A = 800000200
    eps = 1.00e-03, alpha = 1.50, max_iters = 250, normalize = 0
    acceleration_lookback = 10, rho_x = 1.00e-03
    Variables n = 20001, constraints m = 80200
    Cones:  sd vars: 80200, sd blks: 1
    Setup time: 8.28e+04s
    SCS using variable warm-starting
    ----------------------------------------------------------------------------
     Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
    ----------------------------------------------------------------------------
         0| 1.44e+18  2.81e+19  1.00e+00 -1.24e+16 -8.91e+19  2.35e+19  2.60e+01
       100| 8.87e-04  5.67e-01  2.85e-02  3.30e-01  2.84e-01  1.58e-16  1.12e+03
       200| 1.40e-03  9.32e-02  6.29e-03  3.53e-01  3.42e-01  3.95e-17  2.21e+03
       250| 8.31e-03  2.07e+00  9.18e-02  3.64e-01  2.19e-01  6.42e-17  2.75e+03
    ----------------------------------------------------------------------------
    Status: Solved/Inaccurate
    Hit max_iters, solution may be inaccurate, returning best found solution.
    Timing: Solve time: 2.75e+03s
            Lin-sys: nnz in L factor: 1600080401, avg solve time: 1.07e+01s
            Cones: avg projection time: 1.72e-02s
            Acceleration: avg step time: 1.54e-02s
    ----------------------------------------------------------------------------
    Error metrics:
    dist(s, K) = 0.0000e+00, dist(y, K*) = 2.0612e-09, s'y/|s||y| = 4.3840e-12
    primal res: |Ax + s - b|_2 / (1 + |b|_2) = 1.4045e-03
    dual res:   |A'y + c|_2 / (1 + |c|_2) = 9.3231e-02
    rel gap:    |c'x + b'y| / (1 + |c'x| + |b'y|) = 6.2908e-03
    ----------------------------------------------------------------------------
    c'x = 0.3526, -b'y = 0.3419
    ============================================================================

kalmarek commented 3 years ago

tough luck; If you can't re-use A, then I don't think there's much to do about it; You can always ship your own linear solver though, see e.g. https://github.com/cvxgrp/scs/tree/master/linsys/cpu/direct

kmpape commented 3 years ago

Thanks a lot for your reply!

bodono commented 3 years ago

It is strange that the number of CG iterations is so high, and goes up. I would try the non-GPU indirect version, it might be something to do with the GPU. Also SCS 3.0 is very close to being released and that is overall more stable and faster, so you could try that (in the 3.0.0 branch).

kmpape commented 3 years ago

Thank you for your reply. That fixed matrix $\hat{R}$ is very badly conditioned, and it might well be that it results in a badly conditioned $A$. I will try your suggestions, and also linking MKL instead of QDLDL. Thank you for pointing me at the 3.0.0 branch.

cvxgrp / scs

Increased Solve Times for Sequence of Problems #168