JuliaNLSolvers / Optim.jl

Optimization functions for Julia

The role of Levenberg-Marquardt #207

Closed: pkofod closed this issue 7 years ago

pkofod commented 8 years ago

We currently have an active PR (#196), so I thought this might be a good moment to discuss the role and future of LM in Optim.jl.

I find it a bit strange that it is a part of this package. Generally, I think of Optim as a package that takes a Julia function (and gradient and Hessian) and optimizes it (hopefully with better support for constraints in the future). LM does not fit in this framework at all, and I wonder if it should go, just as nnls did with the API update. Sure, it takes an f function and a g function, but there are very special assumptions about them.
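
For concreteness, here is roughly what that difference in assumptions looks like; the signatures below are illustrative toys, not Optim's exact API:

```julia
# Illustrative contrast; these signatures are assumptions, not Optim's exact API.
# A typical Optim objective is scalar-valued:
f_scalar(x) = (x[1] - 1.0)^2 + (x[2] - 3.0)^2

# What the LM routine consumes instead: a residual vector and its Jacobian.
f_resid(x) = [x[1] - 1.0, x[2] - 3.0]   # m-vector of residuals
g_resid(x) = [1.0 0.0; 0.0 1.0]         # m×n Jacobian of the residuals
```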

It would require a few minutes' worth of work to simply copy it line by line into LsqFit.jl (mostly adding Optim. where appropriate). Is this something "we" want, though? And is it something LsqFit.jl wants?

matthieugomez commented 8 years ago

It could also make sense to put it in NLsolve. They are the exact same algorithms.

tpapp commented 8 years ago

From a certain perspective, it would make sense to have least squares algorithms in Optim.jl: they solve an optimization problem where the objective function has the form f(x) = ∑ᵢ F_i(x)². It is true that this can be mapped to a nonlinear system whenever the optimal solution satisfies F_i(x) = 0 ∀ i, but this is not necessarily the case, especially when we have more functions F_i than parameters in x (a common situation in, e.g., economics).
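
For illustration, a minimal Julia sketch of that mapping, with made-up toy residuals:

```julia
# Minimal sketch (illustrative names): collapsing a residual vector
# F(x) = [F_1(x), ..., F_m(x)] into the scalar objective ∑ F_i(x)²
# that any general-purpose optimizer can minimize.
residuals(x) = [x[1] - 1.0, x[2] - 3.0, x[1] * x[2] - 1.0]  # toy F, m = 3 > n = 2

sum_of_squares(x) = sum(abs2, residuals(x))

# e.g. optimize(sum_of_squares, [0.0, 0.0], BFGS()) with Optim loaded.
# With three residuals and two parameters, F_i(x) = 0 for all i is
# generally unattainable, so the minimizer need not be a root.
```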

It is also true that the Levenberg-Marquardt algorithm only attains its fast convergence when F_i(x) ≈ 0 near the optimum, but it is still applicable otherwise; you just drop back to linear convergence in the worst case.

Also, now that PR #196 is merged, the documentation could mention this algorithm, and indeed have a section on least squares algorithms. Perhaps someone would then contribute other ones, e.g. the hybrid algorithm of Dennis et al. (1981).

pkofod commented 8 years ago

Thank you for joining the discussion, @tpapp.

It is not that I am unaware that it is a solver, albeit a very specialized one. It is more a question of whether it fits in Optim. That, of course, depends on what Optim is supposed to be. Personally, I do not think it should contain methods as specialized as LM, but others might disagree.

I think the fact that you mention other algorithms for least squares is exactly why a dedicated least squares package (there is even one out there already!) is more appropriate. There you can have a common interface for various least squares solvers. If we start adding more here, then we have to maintain one interface for "common optimization" and another one specifically for least squares. Now, I am very happy that @bjarthur has been active here in 2016; I just know I am not going to prioritize least squares methods myself.

I did not add documentation for two reasons: 1) honestly, I don't know too many details about the solver, and 2) I had this issue in mind. However, if someone wants to add documentation, I would welcome a PR with open arms.

tpapp commented 8 years ago

One reason for including least squares routines in Optim (besides the fact that they are optimization routines for specialized problems) is sharing code for various common algorithms. E.g. trust region and damped methods have a similar general setup and require solving similar subproblems, Newton-type algorithms involve line searches in both cases, etc. Perhaps they are not written to share this functionality at the moment, but IMO it would be great if they did whenever applicable.
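
As a concrete example of that shared machinery, the damped step at the core of LM solves a linear system closely related to the trust-region subproblem. A self-contained sketch, not Optim's actual implementation:

```julia
using LinearAlgebra

# Sketch of the damped (Levenberg-Marquardt) step: solve
# (J'J + λD) δ = -J'r for the step δ, with D = diag(J'J).
function lm_step(J::AbstractMatrix, r::AbstractVector, λ::Real)
    JtJ = J' * J
    A = JtJ + λ * Diagonal(diag(JtJ))   # Marquardt's diagonal scaling
    return A \ (-(J' * r))              # the damped Gauss-Newton step δ
end
```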

OTOH I fully understand that having least squares would involve yet another interface. Perhaps a wrapper similar to DifferentiableFunction, e.g. SumOfSquaredFunctions, could be used for dispatch: when optimize is called with such an object, it could automatically convert the function via a closure whenever general methods are used.
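
A hypothetical sketch of what that wrapper and its closure-based fallback might look like; SumOfSquaredFunctions and the optimize method below are assumptions, not existing Optim API:

```julia
# Hypothetical sketch; `SumOfSquaredFunctions` and this `optimize` method
# are assumptions, not part of Optim's actual API.
struct SumOfSquaredFunctions{T}
    F::T                     # F(x) returns the residual vector
end

# General-purpose methods get the squared-norm closure; a specialized
# solver such as LM could instead dispatch on the wrapper directly.
function optimize(s::SumOfSquaredFunctions, x0, method)
    objective(x) = sum(abs2, s.F(x))         # closure over the residuals
    return optimize(objective, x0, method)   # forward to the scalar entry point
end
```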

Regarding LM: in current master it does not have a method for Optim.optimize and a corresponding optimization object, so perhaps the first step would be remedying that. Unless recent contributors (e.g. @bjarthur) who have already invested in understanding the code want to take a shot at this, I could submit a PR.
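
For concreteness, a hypothetical sketch of that missing piece; the type name and keyword arguments are assumptions, not code that existed in master:

```julia
# Hypothetical sketch: a method object so LM is reachable via `optimize`.
# `LevenbergMarquardt` and the keyword names here are assumptions.
struct LevenbergMarquardt end

function optimize(F, J, x0::AbstractVector, ::LevenbergMarquardt;
                  lambda::Real = 10.0, maxIter::Integer = 100)
    # delegate to the existing levenberg_marquardt routine (assumed keywords)
    return levenberg_marquardt(F, J, x0; lambda = lambda, maxIter = maxIter)
end
```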

ahwillia commented 8 years ago

> I fully understand that having least squares would involve yet another interface. Perhaps a wrapper similar to DifferentiableFunction, e.g. SumOfSquaredFunctions

I personally think this is too detailed an interface for Optim to support. If we wanted to support SumOfSquaredFunctions, then we should also support SumOfFunctions for objectives like:

```julia
f(x) = f1(x) + f2(x) + ... + fn(x)
```

And there are many more structured objective functions that people may want to build specialized algorithms for in the future.
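
To make the analogy concrete, such a wrapper might look like the following (purely illustrative; neither this type nor any dispatch on it exists in Optim):

```julia
# Purely illustrative wrapper for additively structured objectives.
struct SumOfFunctions{T}
    fs::Vector{T}            # the summands f_1, ..., f_n
end

(s::SumOfFunctions)(x) = sum(f -> f(x), s.fs)   # f(x) = f1(x) + ... + fn(x)
```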

> One reason for including least squares routines in Optim (besides the fact that they are optimization routines for specialized problems) is sharing code for various common algorithms.

But that doesn't mean LM needs to live in Optim. Why not have it live in a separate repository (LsqFit.jl?) and have that repository import any code you'd like to re-use from Optim?

If a centralized repo is wanted, we could create an OptimizationSuite.jl that bundles and re-exports the best of the JuliaOpt packages.

pkofod commented 7 years ago

Done: https://github.com/JuliaOpt/LsqFit.jl/pull/20