Open Alaya-in-Matrix opened 8 years ago
In fact, SLSQP fails whenever the gradients are large. Even if the function is linear. (But of course also for non-linear, non-quadratic functions.)
See also those old mailing list threads from 2011:
I have converted Alexander Riess's example from half Python, half pseudocode to compilable C (`riesstest.c`):
#include <math.h>
#include <nlopt.h>
#include <stdio.h>

#define s 1e8

double myfunc(unsigned n, const double *x, double *grad, void *my_func_data)
{
  if (grad) {
    grad[0] = -s;
  }
  return -s*x[0];
}

int main(void) {
  nlopt_opt opt = nlopt_create(NLOPT_LD_SLSQP, 1);
  nlopt_set_lower_bounds1(opt, -1.);
  nlopt_set_upper_bounds1(opt, 1.);
  nlopt_set_min_objective(opt, myfunc, NULL);
  nlopt_set_xtol_abs1(opt, 1e-10);
  nlopt_set_xtol_rel(opt, 1e-10);
  nlopt_set_ftol_abs(opt, 1e-10);
  nlopt_set_ftol_rel(opt, 1e-10);
  double x[] = {0.};
  double minf;
  int status = 0;
  if (nlopt_optimize(opt, x, &minf) < 0) {
    printf("nlopt failed!\n");
    status = 1;
  }
  else {
    printf("found minimum at f(%g) = %0.10g\n", x[0], minf);
  }
  nlopt_destroy(opt);
  return status;
}
and added it to the tests `CMakeLists.txt`:
add_executable (riesstest riesstest.c)
target_link_libraries (riesstest ${nlopt_lib})
target_include_directories (riesstest PRIVATE ${NLOPT_PRIVATE_INCLUDE_DIRS})
add_dependencies (tests riesstest)
so I was able to run the thing through a debugger.
I tracked down the issue to a defect in LSEI which seems to also have been known for ages, see:
Where it all breaks down is this line in the subroutine `ldp_`:
fac = one - ddot_sl__(m, &h__[1], 1, &w[iy], 1);
(which computes the inner product between the input vector H (`h__`) and the solution from `nnls_` (`&w[iy]`)). The resulting `fac` must not be zero (and in fact `fac+one` must not round to one; this is checked right below the above line, though even relaxing that check would not fix it, see the explanation below). But in the Riess test case, `fac` is approximately 1/s², e.g., for s=1e3, `fac` is around 1e-6. For s=1e8, `fac` is around 1e-16 and completely eaten up by cancellation, resulting in a zero `fac` (exactly zero, so even checking only for exactly zero would not fix anything) and an error.
In addition, the solution `nnls_` returns also hits roundoff limits for s=1e8: the second component is `1.0000000100000002e-08`, and that `2` is where the digits we would actually need start. So even if we somehow managed to compute `fac` exactly, it would still fail because the `&w[iy]` computed by `nnls_` is already destroyed by roundoff cancellation.
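The guard on `fac` can be reproduced in isolation. The following is only a minimal sketch, not the NLopt source: `ldp_guard_passes`, `ideal_fac` and `report` are hypothetical helpers, and the guard is fed the idealized value 1/s² instead of the output of the real dot product.

```c
#include <stdio.h>

/* Hypothetical helper mirroring the guard in ldp_: returns 1 if
 * (1 + fac) - 1 is still positive in double precision, else 0.
 * volatile forces the sum to be rounded to double before subtracting. */
int ldp_guard_passes(double fac)
{
    volatile double d = 1.0 + fac;
    return (d - 1.0) > 0.0;
}

/* Idealized fac for the Riess test, assuming fac ~ 1/s^2. */
double ideal_fac(double s)
{
    return 1.0 / (s * s);
}

/* Prints whether the guard passes for a given gradient scale s. */
void report(double s)
{
    double fac = ideal_fac(s);
    printf("s = %g: fac ~ %g, guard %s\n", s, fac,
           ldp_guard_passes(fac) ? "passes" : "fails");
}
```

Calling `report(1e3)` and `report(1e8)` should show the guard passing for the first and failing for the second: 1e-16 is below half the spacing of doubles just above 1, so `1.0 + fac` rounds back to exactly 1.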
A workaround that works in my application, where I am trying to minimize Gaussian Mixture Models (GMMs) and the large gradients are not actually at the optimum, is to lie to SLSQP about the gradients: scale them down to an infinity norm of 1000 if their actual infinity norm is larger:
/* if the gradient is too large, lie, or SLSQP will fail */
double scale=1.;
int i;
for (i=0; i<DIMX; i++) {
  /* for some reason, a non-power-of-2 works better here */
#define GRAD_MAX 1000.
  if (fabs(scale*grad_f[i]) > GRAD_MAX) {
    /* fabs keeps scale positive, so the gradient direction is preserved */
    scale=GRAD_MAX/fabs(grad_f[i]);
  }
}
if (scale<1.) {
  for (i=0; i<DIMX; i++) {
    grad_f[i]*=scale;
  }
}
As I said, this works for me in my application. It will not necessarily do something reasonable in other applications, especially if they have huge gradients everywhere, including at the optimum. Though it seems to also work at least in the Riess test, where the above can be simplified to:
if (grad[0] < -1000.) grad[0] = -1000.;
But it may or may not work in other applications.
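For reuse across several objective functions, the same idea can be wrapped in a helper. This is a sketch under assumptions: `clamp_gradient_inf_norm` is a hypothetical name, not part of NLopt, and it computes the infinity norm first and then rescales once, dividing by the absolute value so the gradient direction is never flipped.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: if the infinity norm of grad exceeds max_norm,
 * rescale grad in place so that its infinity norm becomes max_norm.
 * Returns the positive scale factor applied, or 1.0 if nothing changed. */
double clamp_gradient_inf_norm(double *grad, size_t n, double max_norm)
{
    double norm = 0.0;
    double scale;
    size_t i;

    /* infinity norm: largest absolute component */
    for (i = 0; i < n; i++) {
        double a = fabs(grad[i]);
        if (a > norm) norm = a;
    }
    if (norm <= max_norm) return 1.0;

    scale = max_norm / norm;
    for (i = 0; i < n; i++) grad[i] *= scale;
    return scale;
}
```

Inside an objective callback this would be called as `clamp_gradient_inf_norm(grad, n, 1000.)` after the true gradient has been filled in. The same caveat applies: the optimizer then sees a gradient that is inconsistent with the function values wherever clamping kicks in.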
I think ultimately the only way to solve this might be to use a different QP solver instead of the bundled LSEI code. The original SLSQP code recommended QPSOL, but that is a proprietary code that is not publicly available. But the Apache-licensed OSQP might be an option nowadays.
Reading the description of the LSEI algorithms (in particular, the LSQ algorithm), I have since realized that the failure to compute `fac` can theoretically be fixed by replacing:
fac = one - ddot_sl__(m, &h__[1], 1, &w[iy], 1);
d__1 = one + fac;
if (d__1 - one <= 0.0) {
goto L50;
}
with:
fac = rnorm * rnorm;
(because what is computed in the first snippet is -r_{n+1}, which is actually proven to be ||r||², and the `rnorm` returned by `nnls` is fairly accurate even in the ill-conditioned cases), but then the algorithm still breaks down because of rounding errors in some other step(s) of the LSEI procedure. (Maybe the computation of r_1 to r_n and x_1 to x_n below in the same function? I am not sure yet where the problem(s) is/are.)
The reason the algorithm still breaks down even if I change the computation of `fac` as above is that the solution from `nnls_` is just too inaccurate to obtain an accurate `ldp_` solution (the relative errors are around 1e-8); the back-transformation done in `lsi_` (`daxpy_sl__(n, &one, &f[1], 1, &x[1], 1);`) then ends up canceling out all the accurate digits (because f is 1e8 and x is supposed to be 1-1e8, but actually ends up around -4-1e8).
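The effect of that back-transformation can be illustrated on a single component. This is a simplified sketch, not the LSEI code: `back_transform` is a hypothetical stand-in for the `daxpy` with factor one, and the 1e-8 relative perturbation is an assumed model of the `nnls_`/`ldp_` inaccuracy described above.

```c
#include <stdio.h>

/* Hypothetical stand-in for the back-transformation x := x + f,
 * reduced to a single component. */
double back_transform(double f, double x)
{
    return f + x;
}

/* Applies a relative perturbation rel_err to x before back-transforming,
 * to mimic an inaccurate intermediate solution. */
double back_transform_perturbed(double f, double x, double rel_err)
{
    return back_transform(f, x * (1.0 + rel_err));
}

/* Compares the exact and the perturbed input for f = 1e8, x = 1 - 1e8. */
void demo(void)
{
    double f = 1e8;
    double x = 1.0 - 1e8;
    printf("exact input:     %g\n", back_transform(f, x));
    printf("perturbed input: %g\n", back_transform_perturbed(f, x, 1e-8));
}
```

A relative error of 1e-8 on a value of magnitude 1e8 is an absolute error of about 1, which is as large as the result the sum is supposed to produce.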
The main issue with the Lawson-Hanson least-squares routines bundled in SLSQP is that they have to transform the problem into a simpler problem step by step. LSEI ends up doing 3 transformations: LSEI→LSI→LDP→NNLS. Each of the transformations worsens the condition of the problem. Ultimately, the transformations seem (empirically; I have not done a formal error analysis) to end up squaring the condition number, i.e., double precision with its relative accuracy of approximately 2e-16 ends up insufficient for a gradient of 1e8.
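As a back-of-the-envelope check of that claim, the expected relative error can be estimated as machine epsilon times the squared gradient scale. This is only a rule-of-thumb sketch based on the empirical "condition number gets squared" observation, and `estimated_relative_error` is a hypothetical helper.

```c
#include <float.h>

/* Rule-of-thumb estimate of the relative error in the result, assuming
 * (empirically, not proven) that the effective condition number is the
 * square of the gradient scale s. */
double estimated_relative_error(double s)
{
    double effective_cond = s * s;
    return DBL_EPSILON * effective_cond;
}
```

For s=1e3 the estimate is on the order of 1e-10, which leaves plenty of accurate digits; for s=1e8 it exceeds 1, i.e., no accurate digits would be expected to survive.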
I am benchmarking the SLSQP algorithm with a simple 10D quadratic function:
To my surprise, SLSQP fails for this simple function, but if I switch the algorithm to `nlopt::LD_LBFGS`, nlopt still optimizes the function efficiently. The version of NLopt is 2.4.2 (given by `nlopt::version`); below is the full code: