A flaw in the algorithm

jjhelmus / leastsqbound-scipy

Constrained multivariate least-squares optimizer for scipy

Other

21 stars 8 forks source link

Hi, guys! I'm afraid I came back with somewhat bad news.

Let's examine transformations used in leastsqbound. A bounded variable is on a boundary, when a corresponding internal variable ...

... is equal to pi or -pi when both min and max bounds are specified.
... is equal to 0 when either min or max are specified.

It's easy to see, that when a bounded variable is on a boundary (takes min or max value), then dP_boundary / dP_internal = 0. It means that all derivatives with respect to such internal variable are equal to zero, i. e. the algorithm treats any variable on a boundary as optimal, and any point on a boundary as an optimal point.

In other word the algorithm can't distinguish between optimal and non-optimal bounds (a variable is optimal on a boundary when a partial derivative points inside a feasible region). As a result the algorithm can be attracted by any boundary.

As an illustration see problem "Watson9_B" in my benchmark. In this problem the starting point is almost on the boundary, leastsqbound makes only 13 iterations and terminates with 12 active constraints, but the solution is not at all optimal.

Unfortunately, it can't be fixed in the framework of leastsqbound. In a lot of cases the algorithm will find true (near) optimal solution, but people should be aware of this issue and its consequences.

This is a known limitation when using a variable transform to enforce boundary conditions. Section 1.3 and 5.3 of the MINUIT User's Guide provides details on the transforms used and their limitation. Since the transforms are non-linear the algorithm can behave poorly near one of the limits. Also, the minimizer can become blocked near a limit where the derivate becomes 0 due to the transformation. Finally, the covariance matrix does give meaningful results if one of the optimization parameters takes a value close to the limits. The user's guide provides the following recommendations:

Limits on variable parameters should be used only when needed in order to prevent the parameter from taking on unphysical values.
When a satisfactory minimum has been found using limits, the limits should then be removed if possible, in order to perform or re-perform the error analysis without limits.

Fundamentally bounds should be avoided on fitting parameters unless absolutely necessary. Even then, the above effects should be considered. If optimization results in a parameter near the limits the results should be rerun either with wider limits or by fixing the value of that parameter at that limit (lmfit makes this process easier than leastsqbound). Starting an optimization with the initial value of a parameter near the limit should be avoided entirely.

I'll add some notes about this in the documentation of the function to give users some basic details on the issue and point them to the relevant sections of the MINUIT user's guide for addition details.

jjhelmus / leastsqbound-scipy

A flaw in the algorithm #9