PerformanceEstimation / PEPit

PEPit is a package enabling computer-assisted worst-case analyses of first-order optimization methods.
https://pepit.readthedocs.io/en/latest/
MIT License
76 stars 9 forks source link

Numerical Issues with StrongSmoothlyConvexQuadraticFunction #97

Open vinitranjan1 opened 5 months ago

vinitranjan1 commented 5 months ago

Hello @AdrienTaylor and @NizarBousselmi,

First, thank you for creating this library! I am working with @bstellato on verification methods specific to parametric QPs here and used PEPit for comparison purposes in our paper. While working with the StrongSmoothlyConvexQuadraticFunction object, I ran into numerical issues with the following example:

import cvxpy as cp
from PEPit import PEP
from PEPit.functions import (
    ConvexFunction,
    SmoothStronglyConvexQuadraticFunction,
)
from PEPit.primitive_steps import proximal_step

def test_quad(mu, L, K, t, r):
    pepit_verbose = 2
    problem = PEP()

    # proximal gradient descent for sum of quadratic and convex function
    func1 = problem.declare_function(ConvexFunction)
    func2 = problem.declare_function(SmoothStronglyConvexQuadraticFunction, L=L, mu=mu)
    func = func1 + func2

    xs = func.stationary_point()
    x0 = problem.set_initial_point()

    x = [x0 for _ in range(K + 1)]
    for i in range(K):
        x[i+1], _, _ = proximal_step(x[i] - t * func2.gradient(x[i]), func1, t)

    problem.set_initial_condition((x0 - xs) ** 2 <= r ** 2)

    # Fixed point residual
    problem.set_performance_metric((x[-1] - x[-2]) ** 2)

    pepit_tau = problem.solve(verbose=pepit_verbose, wrapper='mosek')
    # pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.MOSEK)
    # pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.CLARABEL)

    print('pepit_tau:', pepit_tau)
    return pepit_tau

mu = 1
L = 10
K = 6
t = 0.0125
r = 10

test_quad(mu, L, K, t, r)

Solving with the mosek wrapper directly.

In this case, with

pepit_tau = problem.solve(verbose=pepit_verbose, wrapper='mosek')

I notice that the code errors out with an assertion error in the feasibility check for the PEP object:

 File "/opt/homebrew/Caskroom/miniconda/base/envs/algover_test/lib/python3.9/site-packages/PEPit/pep.py", line 680, in check_feasibility
    assert wc_value == self.objective.eval()
AssertionError

It seems that the final value from Mosek is extracted incorrectly when performing the check as the wc_value is 0 but self.objective.eval() correctly retrieves the answer from Mosek.

Solving through CVXPY

However, when solving through CVXPY to try other solvers, I noticed numerical issues. When running the same instance with an uncommented line, either

pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.MOSEK)

or

pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.CLARABEL)

the solvers error out with numerical issues when using the default tolerances.

For example, Mosek reports

(CVXPY) Mar 25 06:11:46 PM: Interior-point solution summary
(CVXPY) Mar 25 06:11:46 PM:   Problem status  : UNKNOWN
(CVXPY) Mar 25 06:11:46 PM:   Solution status : UNKNOWN
(CVXPY) Mar 25 06:11:46 PM:   Primal.  obj: -2.5809495701e-01   nrm: 2e+02    Viol.  con: 5e-04    var: 3e-06    barvar: 0e+00
(CVXPY) Mar 25 06:11:46 PM:   Dual.    obj: -2.6290013277e-01   nrm: 3e+03    Viol.  con: 0e+00    var: 1e-03    barvar: 1e-05

and Clarabel reports

iter    pcost        dcost       gap       pres      dres      k/t        μ       step
---------------------------------------------------------------------------------------------
  0  -2.1533e-04  -1.5070e+02  1.51e+02  3.75e-01  3.12e-02  1.00e+00  3.20e+01   ------
  1  -2.1533e-04  -1.5070e+02  1.51e+02  3.75e-01  3.12e-02  1.00e+00  3.20e+01  0.00e+00
---------------------------------------------------------------------------------------------
Terminated with status = NumericalError

It seems there is some numeric issue in the problem formulation that gets fed into cvxpy. When using a more general class such as SmoothStronglyConvexFunction for example, all of these issues disappear and the three methods agree.

vinitranjan1 commented 5 months ago

As a followup to this, we were able to use the mosek wrapper directly by try-catching the AssertionError and then extracting the solution from the PEP object itself using problem.objective.eval(). So in our experiments we were able to at least have a better comparison using the StrongSmoothlyConvexQuadraticFunction object and still solve the SDPs with Mosek's default tolerances to update our preprint with the new results.

AdrienTaylor commented 5 months ago

Thank you @vinitranjan1 for raising this issue!

We are currently in deadline mode, but we have started looking at it and will get back to you ASAP! (also, thanks @bgoujaud and @CMoucer for all this work ;-) )

AdrienTaylor commented 5 months ago

@vinitranjan1 Thanks again! I have found an issue in our implementation, in particular in the implementation of the class of quadratic functions (@NizarBousselmi , @bgoujaud ) which, from what I can tell, is due to wrong specifications regarding x* (for the quadratic function). For instance: it implicitly assumes that the minimizer x of the quadratic function is f(x_) = 0 but only in some parts of the code.

With the direct interface to MOSEK, the following quick (dirty) fix seems to "work" (I've added the line xs_f2 = func2.stationary_point() to force the existence of an optimal point for the quadratic function), but I'll dig more into it and come back to you within the following days. The following code might not be the thingthat ultimately does what we want (so I would not use that in a paper just now; we have to investigate it a bit more & provide a fix) , but at least parts of origin of the bug are clear. Thanks a lot again!

` def test_quad(mu, L, K, t, r): pepit_verbose = 2 problem = PEP()

# proximal gradient descent for sum of quadratic and convex function
func1 = problem.declare_function(ConvexFunction)
func2 = problem.declare_function(SmoothStronglyConvexQuadraticFunction, L=L, mu=mu)
func = func1 + func2

xs = func.stationary_point()
xs_f2 = func2.stationary_point()
x0 = problem.set_initial_point()

x = [x0 for _ in range(K + 1)]
for i in range(K):
    x[i+1], _, _ = proximal_step(x[i] - t * func2.gradient(x[i]), func1, t)

problem.set_initial_condition((x0 - xs) ** 2 <= r ** 2)

# Fixed point residual
problem.set_performance_metric((x[-1] - x[-2]) ** 2)

pepit_tau = problem.solve(verbose=pepit_verbose, wrapper='mosek')
#pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.MOSEK)
#pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.CLARABEL)

print('pepit_tau:', pepit_tau)
return pepit_tau

``

@NizarBousselmi It would be nice if you could check the codes implementing quadratic interpolation [also not with the implicit assumption that x_*=0 which was in your original pull request].

AdrienTaylor commented 4 months ago

Hello, @vinitranjan1!

Unfortunately, some problems remain (mostly due to problems on the solvers' side). There is quite a bit of sensitivity in the output, so K should not be too big and the step size should stay reasonable as well (also not too small).

Overall, the conclusion seems to be that

import numpy as np
import cvxpy as cp

from PEPit import PEP
from PEPit.functions import (
    ConvexFunction,
    SmoothStronglyConvexQuadraticFunction,
    SmoothStronglyConvexFunction,
)
from PEPit.primitive_steps import proximal_step

def test_quad(mu, L, K, t, r):
    pepit_verbose = 2
    problem = PEP()

    # proximal gradient descent for sum of quadratic and convex function
    func1 = problem.declare_function(ConvexFunction)
    func2 = problem.declare_function(SmoothStronglyConvexQuadraticFunction, L=L, mu=mu)
    func = func1 + func2

    xs = func.stationary_point()
    #xs_f2 = func2.stationary_point()
    #problem.add_constraint( func2(xs_f2) == 0 )
    x0 = problem.set_initial_point()

    x = [x0 for _ in range(K + 1)]
    for i in range(K):
        x[i+1], _, _ = proximal_step(x[i] - t * func2.gradient(x[i]), func1, t)

    problem.set_initial_condition((x0 - xs) ** 2 <= r ** 2)

    # Fixed point residual
    problem.set_performance_metric((x[-1] - x[-2]) ** 2)

    ## Those two options appear to work better (and to match):
    #pepit_tau = problem.solve(verbose=pepit_verbose, wrapper='mosek')
    pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.CLARABEL)
    # the conclusion seem to be that we should feature a direct interface to clarabel as well.

    ## Those three options are really not great (so probably partly due to the CVXPY model):
    #pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.MOSEK)
    #pepit_tau = problem.solve(verbose=pepit_verbose, solver=cp.SCS)
    #pepit_tau = problem.solve(verbose=pepit_verbose)

    print('pepit_tau:', pepit_tau)
    return pepit_tau

if __name__ == "__main__":
    mu = 1
    L = 10
    K = 10
    t = 1/L/10
    r = 1
    pepit_tau = test_quad(mu, L, K, t, r)