tBuLi / symfit

Symbolic Fitting; fitting as it should be.
http://symfit.readthedocs.org
MIT License

Why does the default optimizer perform so poorly on this simple problem? #227

Closed: lesshaste closed this issue 5 years ago

lesshaste commented 5 years ago

Take this simple code:

from symfit import parameters, Eq, Ge, Fit
from symfit.core.minimizers import BasinHopping
import numpy as np
n = 3
xdata = [2, 8, 11]
print(xdata)
p1, p2, p3 = parameters('p1, p2, p3')
model = p1*p2*p3
constraints = [
    Eq(xdata[0]*p1+(xdata[1]-xdata[0])*p2+(xdata[2]-xdata[1])*p3, 1),
    Ge(p1, p2),
    Ge(p2, p3),
    Ge(p3, 0)
    ]

fit1 = Fit(- model, constraints=constraints, minimizer=BasinHopping)
print(fit1.execute())
fit2 = Fit(- model, constraints=constraints)
print(fit2.execute())

fit1 gives:

p1 1.664284e-01 (std. dev. nan)
p2 7.412702e-02 (std. dev. nan)
p3 7.412702e-02 (std. dev. nan)

with objective function value: -0.00091449

Whereas fit2 gives:

p1 5.000000e-01 (std. dev. nan)
p2 -1.554312e-15 (std. dev. nan)
p3 1.998401e-15 (std. dev. nan)

with objective function value: -1.553069327556e-30

fit2 reports the fitting status message: "Optimization terminated successfully."

Why does fit2 (the default optimizer) perform so poorly on this relatively simple optimization problem?

pckroon commented 5 years ago

In short? Because minimization is hard. One of the problems here (I think) is that each component of the jacobian of the model is the product of the other two parameters, so as soon as two of those become (close to) 0, the jacobian is 0 and the minimization has converged. This is exacerbated by the model and constraints, which drive everything towards 0. You can try the COBYLA minimizer, but that might only accept inequality constraints (the scipy docs are vague on the issue), or give different (better) initial guesses for your parameters.

What I find most worrying here though, is that the scipy docs for Basinhopping make no mention of constraints, so I have the feeling that fit1 should throw an error. Could you try to evaluate your constraints to see if they're satisfied?
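Something along these lines should do it (an untested sketch; I'm assuming your FitResults exposes values via .value() and that the items in your constraints list are still plain sympy relationals at this point):

result = fit1.execute()
values = {p1: result.value(p1), p2: result.value(p2), p3: result.value(p3)}
for constraint in constraints:
    # lhs - rhs should be ~0 for the Eq constraint and >= 0 for the Ge constraints
    residual = (constraint.lhs - constraint.rhs).subs(values)
    print(constraint, '->', residual)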

@tBuLi do we have extended Lagrangian methods on the roadmap?

tBuLi commented 5 years ago

Yeah indeed, I think @pckroon is onto something. Driving one of the parameters to zero is the easiest way to optimize this problem. Perhaps you can set the .min of all the parameters to something slightly bigger than zero to guide it in the right direction. You can also try TrustConstr instead of SLSQP. (Not sure if this is in 0.4.6 yet, otherwise have some patience or run from master.)
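For concreteness, something along these lines is what I have in mind (just a sketch; TrustConstr needs a version newer than 0.4.6, and the 1e-3 lower bound is an arbitrary choice):

from symfit import parameters, Eq, Ge, Fit
from symfit.core.minimizers import TrustConstr

p1, p2, p3 = parameters('p1, p2, p3')
for p in (p1, p2, p3):
    p.min = 1e-3  # keep the parameters away from zero so the gradient cannot vanish

model = p1 * p2 * p3
constraints = [
    Eq(2*p1 + 6*p2 + 3*p3, 1),  # same as xdata[0]*p1 + ... with your xdata
    Ge(p1, p2),
    Ge(p2, p3),  # Ge(p3, 0) is now implied by the lower bound
]
fit3 = Fit(- model, constraints=constraints, minimizer=TrustConstr)
print(fit3.execute())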

Additionally I don't remember if BasinHopping will respect the constraints, check the scipy docs for that. (And indeed, evaluate the constraints to make sure, this is something I will hopefully add to FitResults this weekend.)

I do still dream of adding lagrange multipliers for equality constraints directly on the level of the model, but realistically I'm not expecting that feature any time soon because it still requires some work.

lesshaste commented 5 years ago

Thanks for the reply.

The constraints are satisfied by the BasinHopping result, since 2*1.664644e-01 + 6*7.411903e-02 + 3*7.411903e-02 = 1.00000007.

On the other hand, COBYLA gives me: ValueError: Constraints of type 'eq' not handled by COBYLA.

If I constrain all the parameters to be at least 0.001, then fit2 gives the right answer. Any smaller (e.g. 0.0001) and it doesn't.

It looks like TrustConstr is in 0.4.7.dev3 but not in 0.4.6, which is what pip install gives me.

Lagrange multipliers would be an awesome addition.

Is there a plan to add dual_annealing from scipy too?

pckroon commented 5 years ago

On the other hand, COBYLA gives me: ValueError: Constraints of type 'eq' not handled by COBYLA.

I kind of expected it would. What you could do is turn it into two inequality constraints with some margin (e.g. expr < 1 + 1e-3 and expr > 1 - 1e-3).
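Roughly like this (untested sketch; Le is sympy's counterpart to Ge, import it from symfit instead if your version exports it):

from sympy import Le
from symfit import Ge

eps = 1e-3
expr = xdata[0]*p1 + (xdata[1] - xdata[0])*p2 + (xdata[2] - xdata[1])*p3
constraints_cobyla = [
    Le(expr, 1 + eps),  # expr <= 1 + eps
    Ge(expr, 1 - eps),  # expr >= 1 - eps
    Ge(p1, p2),
    Ge(p2, p3),
    Ge(p3, 0),
]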

If I constrain all the parameters to be at least 0.001, then fit2 gives the right answer. Any smaller (e.g. 0.0001) and it doesn't.

Once two of those parameters reach 1e-4, the jacobian component for the third becomes 1e-4**2 == 1e-8, which is probably small enough for the algorithm in question to think it has converged to a minimum (and I don't blame it).
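You can see this directly from the symbolic derivative:

from symfit import parameters

p1, p2, p3 = parameters('p1, p2, p3')
model = p1 * p2 * p3
dmodel_dp3 = model.diff(p3)  # p1*p2: the product of the other two parameters
print(dmodel_dp3)
print(dmodel_dp3.subs({p1: 1e-4, p2: 1e-4}))  # 1.0e-8, an essentially flat direction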

Is there a plan to add dual_annealing from scipy too?

Yes, but I'm not sure on what timescale. We (@tBuLi and I) still need to discuss what we want to do about versions of dependencies and backward compatibility, but that's not a discussion for here. dual_annealing specifically is going to be slightly tricky to wrap, since we want to hide away the scipy interface completely, also for the local minimizer, and I don't see how to do that right now. It might be obvious, but it probably requires some thought. shgo should be trivial to wrap though.

I do still dream of adding lagrange multipliers for equality constraints directly on the level of the model, but realistically I'm not expecting that feature any time soon because it still requires some work.

Why only for equality constraints and not for constraints in general? In my limited view constraint violations would become a penalty on top of the objective function, but it'll be tricky to combine the two domains. Also, can't we borrow or translate the implementation from somewhere else? See also https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/ under "Augmented Lagrangian algorithm" and the references there. In addition, nlopt is open source, but mostly in C(++)/Fortran. See https://github.com/stevengj/nlopt

tBuLi commented 5 years ago

Once two of those parameters become 1e-4 the jacobian for the third becomes 1e-4**2 == 1e-8, which is probably small enough for the algorithm in question to think it's converged to a minimum (and I don't blame it).

If that is indeed the problem, I think you should try the following: fit2.execute(tol=1e-15) (or even smaller values). I think this could immediately solve the problem; related to #217.

Why only for equality constraints and not for constraints in general?

Simple answer: because I'm quite familiar with the mathematics to do that for equality constraints, but not at all with the Karush–Kuhn–Tucker conditions :). So equality constraints would be a quick first step; the rest requires quite a bit more reading. But of course the code should be planned in such a way that this can be added later with the same API.
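For the record, the equality-only case for this exact problem would look roughly like this (plain sympy, not an existing symfit API; note that it deliberately ignores the Ge constraints, which is exactly where the KKT conditions would come in):

from sympy import symbols, diff, solve

p1, p2, p3, lam = symbols('p1 p2 p3 lambda', positive=True)
objective = -p1 * p2 * p3
g = 2*p1 + 6*p2 + 3*p3 - 1  # equality constraint g(p) = 0

L = objective + lam * g  # the Lagrangian
stationary = solve([diff(L, v) for v in (p1, p2, p3, lam)], [p1, p2, p3, lam], dict=True)
print(stationary)  # p1 = 1/6, p2 = 1/18, p3 = 1/9, which violates Ge(p2, p3)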

pckroon commented 5 years ago

If that is indeed the problem, I think you should try the following: fit2.execute(tol=1e-15) (or even smaller values). I think this could immediately solve the problem; related to #217.

Careful there! That has to do with the change in the parameters w.r.t. the previous step, not necessarily the value of the Jacobian, and therefore depends on how the algorithm decides the step size to take! Alternatively you can pass options={"ftol": 1e-9} to execute, which deals with the change in the objective function.

lesshaste commented 5 years ago

I tried

fit2 = Fit(- model, constraints=constraints)
print(fit2.execute(options={"ftol": 1e-9}))

but it didn't help. However, print(fit2.execute(options={"ftol": 1e-10})) did work.

tBuLi commented 5 years ago

Excellent. You seem to have quite an interesting minimization problem on your hands.

pckroon commented 5 years ago

I think this is resolved since the fit now works :) Feel free to reopen this issue or open a new one if you have further questions! @tBuLi Could you archive/remember the off-topic notes here for future reference?