cvxgrp / scs

Splitting Conic Solver
MIT License

SCS GPU convergence in a simple problem #206

Open zhouyou-gu opened 2 years ago

zhouyou-gu commented 2 years ago

Hi everyone,

I attempted to solve this example problem with the SCS GPU solver: the CVXPY example "Semidefinite program".

However, the solver does not converge: the residuals stay at the same values from iteration 250 onward. The output is attached below, and the solver call is prob.solve(verbose=True, gpu=True, use_indirect=True, max_iters=10000)

May I have some hints on this issue? Many thanks in advance!

===============================================================================
                                     CVXPY                                     
                                    v1.1.18                                    
===============================================================================
(CVXPY) Jan 30 02:29:14 AM: Your problem has 9 variables, 4 constraints, and 0 parameters.
(CVXPY) Jan 30 02:29:14 AM: It is compliant with the following grammars: DCP, DQCP
(CVXPY) Jan 30 02:29:14 AM: (If you need to solve this problem multiple times, but with different data, consider using parameters.)
(CVXPY) Jan 30 02:29:14 AM: CVXPY will first compile your problem; then, it will invoke a numerical solver to obtain a solution.
-------------------------------------------------------------------------------
                                  Compilation                                  
-------------------------------------------------------------------------------
(CVXPY) Jan 30 02:29:14 AM: Compiling problem (target solver=SCS).
(CVXPY) Jan 30 02:29:14 AM: Reduction chain: Dcp2Cone -> CvxAttr2Constr -> ConeMatrixStuffing -> SCS
(CVXPY) Jan 30 02:29:14 AM: Applying reduction Dcp2Cone
(CVXPY) Jan 30 02:29:14 AM: Applying reduction CvxAttr2Constr
(CVXPY) Jan 30 02:29:14 AM: Applying reduction ConeMatrixStuffing
(CVXPY) Jan 30 02:29:14 AM: Applying reduction SCS
(CVXPY) Jan 30 02:29:14 AM: Finished problem compilation (took 7.578e-03 seconds).
-------------------------------------------------------------------------------
                                Numerical solver                               
-------------------------------------------------------------------------------
(CVXPY) Jan 30 02:29:14 AM: Invoking solver SCS  to obtain a solution.
------------------------------------------------------------------
           SCS v3.1.0 - Splitting Conic Solver
    (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 6, constraints m: 9
cones:    z: primal zero / dual free vars: 3
      s: psd vars: 6, ssize: 1
settings: eps_abs: 1.0e-05, eps_rel: 1.0e-05, eps_infeas: 1.0e-07
      alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
      max_iters: 10000, normalize: 1, warm_start: 0
      acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-indirect GPU
      nnz(A): 24, nnz(P): 0
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 5.96e+00  5.87e+00  9.34e+00 -7.11e+00  1.00e-01  1.52e-02 
   250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  1.67e-02 
   500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  1.83e-02 
   750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  1.98e-02 
  1000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  2.13e-02 
  1250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  2.28e-02 
  1500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  2.42e-02 
  1750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  2.58e-02 
  2000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  2.72e-02 
  2250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  2.87e-02 
  2500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  3.02e-02 
  2750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  3.17e-02 
  3000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  3.33e-02 
  3250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  3.48e-02 
  3500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  3.63e-02 
  3750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  3.78e-02 
  4000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  3.93e-02 
  4250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  4.08e-02 
  4500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  4.23e-02 
  4750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  4.38e-02 
  5000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  4.53e-02 
  5250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  4.68e-02 
  5500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  4.83e-02 
  5750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  4.98e-02 
  6000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  5.13e-02 
  6250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  5.28e-02 
  6500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  5.43e-02 
  6750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  5.58e-02 
  7000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  5.73e-02 
  7250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  5.88e-02 
  7500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  6.03e-02 
  7750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  6.18e-02 
  8000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  6.33e-02 
  8250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  6.48e-02 
  8500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  6.63e-02 
  8750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  6.78e-02 
  9000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  6.93e-02 
  9250| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  7.08e-02 
  9500| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  7.23e-02 
  9750| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  7.38e-02 
 10000| 3.59e+00  4.45e+00  4.68e+00 -3.55e+00  1.00e-01  7.53e-02 
------------------------------------------------------------------
status:  solved (inaccurate - reached max_iters)
timings: total: 9.72e-01s = setup: 8.96e-01s + solve: 7.53e-02s
     lin-sys: 5.15e-02s, cones: 4.40e-03s, accel: 6.20e-04s
------------------------------------------------------------------
objective = -3.548514 (inaccurate)
------------------------------------------------------------------
/usr/local/lib/python3.8/dist-packages/cvxpy/problems/problem.py:1296: UserWarning: Solution may be inaccurate. Try another solver, adjusting the solver settings, or solve with verbose=True for more information.
  warnings.warn(
-------------------------------------------------------------------------------
                                    Summary                                    
-------------------------------------------------------------------------------
(CVXPY) Jan 30 02:29:15 AM: Problem status: optimal_inaccurate
(CVXPY) Jan 30 02:29:15 AM: Optimal value: -5.889e+00
(CVXPY) Jan 30 02:29:15 AM: Compilation took 7.578e-03 seconds
(CVXPY) Jan 30 02:29:15 AM: Solver (including time spent in interface) took 9.757e-01 seconds
The optimal value is -5.889203673968009
A solution X is
[[-1.16502179  0.64147762 -0.2903137 ]
 [ 0.64147762 -0.71057197  0.61247697]
 [-0.2903137   0.61247697 -0.22595884]]
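
A quick way to confirm that the returned X is not a usable solution is to test it against the semidefinite constraint directly. The sketch below copies the matrix printed above and looks at its diagonal and eigenvalues. The tolerance of 1e-6 is an arbitrary choice made for illustration, not an SCS setting.

```python
import numpy as np

# Matrix X exactly as printed in the solver output above.
X = np.array([
    [-1.16502179, 0.64147762, -0.2903137],
    [0.64147762, -0.71057197, 0.61247697],
    [-0.2903137, 0.61247697, -0.22595884],
])

# A PSD matrix must have a nonnegative diagonal and nonnegative eigenvalues.
diag_ok = bool(np.all(np.diag(X) >= 0))
min_eig = float(np.linalg.eigvalsh(X).min())

# Tolerance chosen for illustration only (an assumption).
tol = 1e-6
is_psd = min_eig >= -tol

print("diagonal nonnegative:", diag_ok)
print("smallest eigenvalue:", min_eig)
print("X is PSD within tolerance:", is_psd)
```

All three diagonal entries are negative, so X cannot be positive semidefinite, which is consistent with the large primal residual in the iteration log.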

bodono commented 2 years ago

Yes, there is something broken about the GPU solver on some machines (e.g., https://github.com/cvxgrp/scs/issues/180). What OS are you using, and what GPU do you have?

I would recommend sticking to the direct CPU solver for now.

zhouyou-gu commented 2 years ago

Thanks for your reply! The program ran on a virtual machine with Ubuntu 20.04, CUDA 11.2, and a 2080 Ti GPU. I use the scs-python interface, compiled with --scs --gpu. I am not sure whether I compiled everything correctly, even though the program runs and reports that the GPU is connected.

I will look into the issue you linked in detail and report back later.

I also have a question about GPU acceleration. I am currently working on a real-time application; with the CPU solver, my problem takes tens of milliseconds to solve and needs approximately 1k iterations. Roughly how much do you think the GPU could reduce the solve time compared with the CPU?

bodono commented 2 years ago

Thanks for the info. I'm not sure what is causing this issue, and it's something we're looking at.

As for timing, in many cases the GPU solver is actually slower, so you're probably better off sticking with the direct CPU solver. By the way, have you tried warm-starting the solver? That might provide some speedup. You can also try CVXPY's disciplined parametrized programming (https://www.cvxpy.org/tutorial/advanced/index.html#disciplined-parametrized-programming). Finally, you should reuse the scale value that SCS determines works best from one solve to the next.

zhouyou-gu commented 2 years ago

Thanks for the suggestions! I have actually tried every method you mentioned to speed up the process :). Nevertheless, since the GPU may not be faster, I will just stick with the CPU solver. I appreciate your help!