Alaya-in-Matrix opened this issue 8 years ago
In fact, SLSQP fails whenever the gradients are large, even if the function is linear (and of course also for non-linear, non-quadratic functions).
See also those old mailing list threads from 2011:
I have converted Alexander Riess's example from half Python, half pseudocode to compilable C (`riesstest.c`):
```c
#include <math.h>
#include <nlopt.h>
#include <stdio.h>

#define s 1e8

/* linear objective f(x) = -s*x[0]; the constant gradient -s is huge for s = 1e8 */
double myfunc(unsigned n, const double *x, double *grad, void *my_func_data)
{
    if (grad) {
        grad[0] = -s;
    }
    return -s*x[0];
}

int main(void) {
    nlopt_opt opt = nlopt_create(NLOPT_LD_SLSQP, 1);
    nlopt_set_lower_bounds1(opt, -1.);
    nlopt_set_upper_bounds1(opt, 1.);
    nlopt_set_min_objective(opt, myfunc, NULL);
    nlopt_set_xtol_abs1(opt, 1e-10);
    nlopt_set_xtol_rel(opt, 1e-10);
    nlopt_set_ftol_abs(opt, 1e-10);
    nlopt_set_ftol_rel(opt, 1e-10);
    double x[] = {0.};
    double minf;
    int status = 0;
    if (nlopt_optimize(opt, x, &minf) < 0) {
        printf("nlopt failed!\n");
        status = 1;
    }
    else {
        printf("found minimum at f(%g) = %0.10g\n", x[0], minf);
    }
    nlopt_destroy(opt);
    return status;
}
```
and added it to the tests `CMakeLists.txt`:
```cmake
add_executable (riesstest riesstest.c)
target_link_libraries (riesstest ${nlopt_lib})
target_include_directories (riesstest PRIVATE ${NLOPT_PRIVATE_INCLUDE_DIRS})
add_dependencies (tests riesstest)
```
so I was able to run the thing through a debugger.
I tracked down the issue to a defect in LSEI which also seems to have been known for ages; see:
Where it all breaks down is this line in the subroutine `ldp_`:

```c
fac = one - ddot_sl__(m, &h__[1], 1, &w[iy], 1);
```
(which computes the inner product between the input vector H (`h__`) and the solution from `nnls_` (`&w[iy]`)). The resulting `fac` must not be zero; in fact, `fac+one` must not round to one, which is checked right below the above line, though even relaxing that check would not fix it, see the explanation below. But in the Riess test case, `fac` is approximately 1/s², e.g., for s=1e3, `fac` is around 1e-6. For s=1e8, `fac` is around 1e-16 and completely eaten up by cancellation, resulting in a zero `fac` (exactly zero, so even checking only for exact zero would not fix anything) and an error.
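The cancellation can be reproduced in isolation with a toy snippet (artificial dot-product terms, not the actual LSEI data, but they cancel the same way):

```c
#include <stdio.h>

int main(void) {
    double s = 1e8;
    /* two dot-product terms whose exact sum is 1 - 1/s^2 */
    double t1 = s;
    double t2 = -(s - 1.0 + 1.0/(s*s));
    double dot = t1 + t2;   /* rounds to exactly 1.0 in double precision */
    double fac = 1.0 - dot; /* should be 1/s^2 = 1e-16, but prints 0 */
    printf("fac = %.17g\n", fac);
    return 0;
}
```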
In addition, the solution `nnls_` returns also hits roundoff limits for s=1e8: the second component is `1.0000000100000002e-08`, and that trailing `2` is where the digits we would actually need start. So even if we somehow manage to compute `fac` exactly, it would still fail because the `&w[iy]` computed by `nnls_` is already destroyed by roundoff cancellation.
A workaround I have found to work in my application, which minimizes Gaussian Mixture Models (GMMs) and where the large gradients are not actually at the optimum, is to lie to SLSQP about the gradients, scaling them down to an infinity norm of 1000 whenever their actual infinity norm is larger:
```c
/* if the gradient is too large, lie, or SLSQP will fail */
/* for some reason, a non-power-of-2 works better here */
#define GRAD_MAX 1000.
double scale = 1.;
int i;
for (i = 0; i < DIMX; i++) {
    if (fabs(scale*grad_f[i]) > GRAD_MAX) {
        scale = GRAD_MAX/fabs(grad_f[i]); /* fabs so a negative component cannot flip the sign of scale */
    }
}
if (scale < 1.) {
    for (i = 0; i < DIMX; i++) {
        grad_f[i] *= scale;
    }
}
```
As I said, this works for me in my application. It will not necessarily do something reasonable in other applications, especially ones that have huge gradients everywhere, including at the optimum. It does, however, also work at least in the Riess test, where the above can be simplified to:

```c
if (grad[0] < -1000.) grad[0] = -1000.;
```
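To be concrete, here is a sketch of the Riess objective from above with that simplified clamp applied (`myfunc_clamped` is just an illustrative name):

```c
/* the Riess objective with the gradient lie applied (sketch) */
double myfunc_clamped(unsigned n, const double *x, double *grad, void *data)
{
    if (grad) {
        grad[0] = -s;                           /* true gradient: -1e8 */
        if (grad[0] < -1000.) grad[0] = -1000.; /* lie to SLSQP */
    }
    return -s*x[0];                             /* the objective value stays exact */
}
```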
I think ultimately the only way to solve this might be to use a different QP solver instead of the bundled LSEI code. The original SLSQP code recommended QPSOL, but that is proprietary code that is not publicly available. The Apache-licensed OSQP, however, might be an option nowadays.
Reading the description of the LSEI algorithms (in particular, the LSQ algorithm), I have since realized that the failure to compute `fac` can theoretically be fixed by replacing:

```c
fac = one - ddot_sl__(m, &h__[1], 1, &w[iy], 1);
d__1 = one + fac;
if (d__1 - one <= 0.0) {
    goto L50;
}
```

with:

```c
fac = rnorm * rnorm;
```

(because what is computed in the first snippet is -r_{n+1}, which is actually proven to equal ||r||², and the `rnorm` returned by `nnls_` is fairly accurate even in the ill-conditioned cases), but then the algorithm still breaks down because of rounding errors in some other step(s) of the LSEI procedure. (Maybe the computation of r_1 to r_n and x_1 to x_n below in the same function? I am not sure yet where the problem(s) is/are.)
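For reference, the identity behind that replacement comes from the standard Lawson-Hanson reduction of LDP to NNLS (I am restating their construction here; the notation is theirs, not the bundled code's): the LDP problem min ||x|| s.t. Gx >= h is solved via the NNLS problem min_{u>=0} ||Eu - f|| with E = [G'; h'] and f = (0, ..., 0, 1)'. Writing r = Eu - f for the residual, complementary slackness at the NNLS optimum gives r'(Eu) = 0, and therefore

||r||² = r'r = r'(Eu - f) = -r'f = -r_{n+1} = 1 - h'u = fac.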
The reason the algorithm still breaks down even if I change the computation of `fac` as above is that the solution from `nnls_` is just too inaccurate to obtain an accurate `ldp_` solution (the relative errors are around 1e-8), and then the back-transformation done in `lsi_` (`daxpy_sl__(n, &one, &f[1], 1, &x[1], 1);`) ends up canceling out all the accurate digits (because f is 1e8 and x is supposed to be 1-1e8, but actually ends up around -4-1e8).
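The digit loss in that back-transformation is easy to reproduce with toy numbers (the 5e-8 relative error below is illustrative, chosen to match the observed result of about -4):

```c
#include <stdio.h>

int main(void) {
    double f = 1e8;
    double x_exact  = 1.0 - f;                /* ideal intermediate solution: -99999999 */
    double x_approx = x_exact * (1.0 + 5e-8); /* ~1e-8 relative error from nnls_ */
    printf("exact:  %g\n", x_exact + f);      /* prints 1 */
    printf("approx: %g\n", x_approx + f);     /* prints about -4 */
    return 0;
}
```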
The main issue with the Lawson-Hanson least-squares routines bundled in SLSQP is that they have to transform the problem into a simpler problem step by step. LSEI ends up doing 3 transformations: LSEI→LSI→LDP→NNLS. Each transformation worsens the conditioning of the problem. Ultimately, the transformations seem (empirically; I have not done a formal error analysis) to end up squaring the condition number, i.e., double precision with its relative accuracy of approximately 2e-16 ends up insufficient for a gradient of 1e8, whose square, 1e16, already exceeds 1/(2e-16) = 5e15.
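A back-of-the-envelope check of that claim (toy arithmetic, not code from the SLSQP source):

```c
#include <stdio.h>

int main(void) {
    double eps = 2e-16; /* relative accuracy of double precision */
    double g   = 1e8;   /* gradient magnitude from the Riess test */
    /* if the condition number gets squared, roughly eps * g^2 of
       absolute error remains; once that is O(1), no digits survive */
    printf("eps * g^2 = %g\n", eps * g * g); /* prints 2 */
    return 0;
}
```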
In https://github.com/kkofler/nlopt/commits/slsqp-osqp/ I have implemented an experimental OSQP backend for SLSQP.
Currently, the backend (OSQP or LSEI) has to be selected at compile time. It should probably be selectable as a parameter to SLSQP, if OSQP is compiled in at all.
To compile with OSQP support, you currently need to have OSQP compiled separately and installed somewhere. I have not bundled it.
With OSQP enabled, SLSQP now passes the existing tests plus the Alexander Riess test above up to s=1e31 (s=1e32 fails even with OSQP).
I have not yet tried it in my application.
Note that I am using OSQP to solve the factored least-squares form of the QP, brought into QP form by a variable transformation as per: https://osqp.org/docs/release-0.6.3/examples/least-squares.html
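For concreteness, the transformation from that page, restated in the notation used here (my paraphrase of the linked example, not code from my branch): instead of minimizing ||Ex - F||² directly, introduce an auxiliary variable y for the residual and solve

minimize (1/2) y'y subject to y = Ex - F

(together with the original linear constraints on x), which is a sparse QP in (x, y) that OSQP handles natively.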
OSQP would also be able to solve the expanded QP form minimizing (EX-F)'(EX-F) = X'E'EX - 2(E'F)'X + F'F, where the constant term can be ignored, E'E = L√D√DL' = LDL' = H, and E'F = L√D(-inv(√D))inv(L)G = -G. However, `lsq_` gets passed H only in factored form, so one would have to multiply it back up to H, and then OSQP would factor it again.
Be warned that the OSQP backend linked above is still experimental. I have run into problems where OSQP simply will not converge (and my attempts at tweaking the OSQP settings have been unsuccessful so far).
To repair the Lawson-Hanson algorithm, we would probably need to introduce scaling factors β and γ as in Section III of http://cse.lab.imtlucca.it/~bemporad/publications/papers/ieeetac-proxnnls.pdf – however, the step-by-step transformation from QP to NNLS (where SLSQP starts from the LSEI step) is sufficiently different from Bemporad's one-step transformation that I am not sure how to introduce those parameters correctly.
I am benchmarking the SLSQP algorithm with a simple 10D quadratic function:
To my surprise, SLSQP fails for this simple function, but if I switch the algorithm to `nlopt::LD_LBFGS`, NLopt still optimizes the function efficiently. The version of NLopt is 2.4.2 (given by `nlopt::version`); below is the full code: