mathnet / mathnet-numerics

Math.NET Numerics
http://numerics.mathdotnet.com
MIT License

Merge Optimization Efforts #283

Closed: scottstephens closed this issue 7 years ago

scottstephens commented 9 years ago

As discussed in https://github.com/mathnet/mathnet-numerics/pull/173, we have a few independent efforts to add optimization support to mathnet-numerics. We would like to merge these efforts and eventually get the code into the mathnet-numerics master.

I'm starting this issue so we have a central place to discuss these efforts, since it will likely involve multiple pull requests.

The first thing we need to do is get the various contributors to settle on an API. To that end, I've created a pull request (https://github.com/mathnet/mathnet-numerics/pull/282) with the start of some experimental code. Right now, it's basically got the API, and an implementation of a Newton unconstrained minimizer. It's branched from the current mathnet-numerics master.

Suggested next steps:

1. @cdrnet should pull my branch into its own mathnet branch.
2. @joemoorhouse should add an algorithm that calls out to native code using the candidate API, to see how he likes working with it and how native code fits in.
3. Any interested parties, including at a minimum me, @joemoorhouse, and @cdrnet, should discuss the API.
4. Once the API is settled, we can add additional algorithms.

cdrnet commented 9 years ago

(1) Done: unified_optimization branch.

joemoorhouse commented 9 years ago

Hi Scott, just had a look at the API, and personally I like the concept. I think it is nice to have an interface in which the minimizer asks for the function value, gradient, or Hessian (and in turn says when the point at which these are calculated has changed). I think that is going to give enough flexibility for cases where calculations of these quantities share computationally intensive components.

I have one idea (which might or might not be a good one! :)): I have been playing around with the interfaces and wonder whether it is possible to simplify them somewhat, since IEvaluation is already responsible for calculating and storing the function value, gradient, and Hessian. I wonder if we can combine IObjectiveFunction and IEvaluation, and the objects that implement them, into one. The single interface would be:

public interface IObjectiveFunction
{
    Vector<double> Point { get; set; }
    double Value { get; }
    Vector<double> Gradient { get; }
    Matrix<double> Hessian { get; }
}

I think that EvaluationStatus can (even should?) be part of the internals. The setter of the Point could be used to indicate when this has been changed and be used to reset EvaluationStatus.
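
To illustrate, here is a minimal sketch of that idea (class and field names are hypothetical, not code from the branch): the Point setter resets the cached state, and Value and Gradient are then recomputed lazily on first access.

using System;
using MathNet.Numerics.LinearAlgebra;

public class LazyObjectiveFunction : IObjectiveFunction
{
    readonly Func<Vector<double>, double> _function;
    readonly Func<Vector<double>, Vector<double>> _gradient;

    Vector<double> _point;
    double? _value;                // null means "not yet evaluated at _point"
    Vector<double> _gradientValue;

    public LazyObjectiveFunction(Func<Vector<double>, double> function,
                                 Func<Vector<double>, Vector<double>> gradient)
    {
        _function = function;
        _gradient = gradient;
    }

    public Vector<double> Point
    {
        get { return _point; }
        set
        {
            _point = value;
            _value = null;          // the point changed: reset the internal
            _gradientValue = null;  // evaluation status
        }
    }

    public double Value
    {
        get
        {
            if (!_value.HasValue) _value = _function(_point);
            return _value.Value;
        }
    }

    public Vector<double> Gradient
    {
        get { return _gradientValue ?? (_gradientValue = _gradient(_point)); }
    }

    public Matrix<double> Hessian
    {
        get { throw new NotSupportedException("No Hessian available for this objective."); }
    }
}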

Anyway, I have put an illustration of this here (and can flesh out more if required).

https://github.com/joemoorhouse/mathnet-numerics/blob/unified_optimization/src/Numerics/Optimization/ObjectiveFunctionAlternative.cs

Just an idea to get the discussion going (and I might well be missing something!). As you will see, it is just a variation on the current API, sharing all the same concepts.

Thanks, Joe

grovesNL commented 9 years ago

I'm very interested in non-linear optimization with this library. I noticed that there is a "Nonlinear Optimization" item listed under "Optimization" in the Math.NET documentation, but based on this discussion it appears to still be under development.

Is this the latest discussion? I would like the API to stabilize before using it in production code.

ndalchau commented 9 years ago

Similar question from me. Are there plans to make Levenberg-Marquardt available?

cdrnet commented 9 years ago

Yes - it would actually be very helpful if you could quickly review whether this proposal would work well in your particular scenarios (from the API/architecture point of view).

scottstephens commented 9 years ago

The API is not yet stable.

@ndalchau: @joemoorhouse's optimization branch has a wrapper around the MKL non-linear least squares optimizer. I suspect he will contribute this to the final version once the API stabilizes; I'm not sure whether it is LM or not. If you're not wedded to the algorithm, I generally use BFGS-B for small-scale nonlinear least squares problems, and that will definitely be included.

ndalchau commented 9 years ago

Generally the architecture looks fine. BFGS-B would be a great addition, but for most applications I need optimisers for which I cannot provide derivative information. I believe you need to specify gradients/Hessians for BFGS, right?

scottstephens commented 9 years ago

@ndalchau : BFGS needs gradients, but not Hessians. However, Levenberg–Marquardt also needs gradients. If other libraries were letting you get away without specifying gradients, it was because they were automatically computing them under the hood by finite differences or automatic differentiation, not because of anything specific to Levenberg–Marquardt. It's not implemented currently, but it wouldn't be very hard to write an objective function class that takes a value-only objective function class and transforms it into a value + gradient objective function class by finite differencing. That's roughly the plan I have for adding support for "gradient-free" optimization.
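
As a rough sketch of that wrapper idea (hypothetical class name, written against the single IObjectiveFunction interface proposed above, not code from the branch):

using System;
using MathNet.Numerics.LinearAlgebra;

// Wraps a value-only function and supplies a gradient by forward differences.
public class FiniteDifferenceObjectiveFunction : IObjectiveFunction
{
    readonly Func<Vector<double>, double> _function;
    readonly double _stepSize;

    public FiniteDifferenceObjectiveFunction(Func<Vector<double>, double> function,
                                             double stepSize = 1e-8)
    {
        _function = function;
        _stepSize = stepSize;
    }

    public Vector<double> Point { get; set; }

    public double Value
    {
        get { return _function(Point); }
    }

    public Vector<double> Gradient
    {
        get
        {
            var f0 = _function(Point);
            var gradient = Vector<double>.Build.Dense(Point.Count);
            for (int i = 0; i < Point.Count; i++)
            {
                // forward difference: (f(x + h*e_i) - f(x)) / h
                var shifted = Point.Clone();
                shifted[i] += _stepSize;
                gradient[i] = (_function(shifted) - f0) / _stepSize;
            }
            return gradient;
        }
    }

    public Matrix<double> Hessian
    {
        get { throw new NotSupportedException("No Hessian available for this objective."); }
    }
}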

ndalchau commented 9 years ago

@scottstephens : Thanks for the clarification. Yes, the implementation of LM I've previously used optionally allows providing a Jacobian; if you don't provide one, I guess it reconstructs it numerically. I'd forgotten about this. There are also the simplex methods, which are interesting and useful and would be in the gradient-free category; Nelder-Mead, for example. Something to bear in mind. I'd be very happy to test out preliminary implementations of gradient-free optimizers if/when you have something. Just let me know, and point me to the relevant place.

grovesNL commented 9 years ago

@scottstephens: I believe that approach would be suitable for my application as well. In particular I am interested in non-linear least squares optimization, for which either of the methods you discussed should work.

In general, the API looks very reasonable and seems to make sense. The current objective function interface appears to be flexible enough to allow for the optimization of non-linear piecewise functions, which was one of my largest concerns.

scottstephens commented 9 years ago

@joemoorhouse : Finally got a chance to look at your suggestion for merging IEvaluation and IObjectiveFunction into a single interface. I like the concept; it has always bothered me that there were two classes involved in implementing an objective function. I took a stab at fleshing it out: https://github.com/scottstephens/mathnet-numerics/tree/unified_optimization_single_objective_interface

Let me know what you think.

joemoorhouse commented 9 years ago

Hi @scottstephens: yes, the IEvaluation interface looks good to me. My main comment is semantic: I would suggest changing the name ;) to IEvaluator (or IObjectiveFunction) instead. That emphasizes the fact that the object is able to provide the function value, gradient, or Hessian, rather than the fact that it stores them.

For example it feels a little odd to me to pass an evaluation to a FindMinimum method (this is an actual value), whereas it makes sense to pass an evaluator:

public MinimizationOutput FindMinimum(IEvaluation objective, Vector initial_guess)

then becomes:

public MinimizationOutput FindMinimum(IEvaluator objective, Vector initial_guess)

We know that the object that implements the interface will most commonly cache the value, gradient and Hessian so that these are only recalculated when the point changes, but (I think the idea is) that the minimiser should not really care about this; it just wants the values. The interface forces the minimiser to change the point separately from getting the values, which encourages less frequent point changes, but that is all.
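
Concretely, the pattern from the minimiser's side would be something like this (illustrative fragment; objective and candidate are assumed names):

// Change the point once, then read off whatever the algorithm needs;
// the evaluator is free to cache and to share work between these reads.
objective.Point = candidate;
double value = objective.Value;
Vector<double> gradient = objective.Gradient;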

This might imply that CreateNew() is then unnecessary, I think (you just use the IEvaluator that is passed in rather than having to make a new one), although I can see that a Clone() method might still be useful (which is basically the same thing)?

Again, just some thoughts, happy to go with the consensus. And clearly not a massively big deal!

cdrnet commented 9 years ago

I've started to catch up with this again, see the updated unified_optimization branch in this repo.

tibel commented 9 years ago

@cdrnet Great to see progress in this area.

I have two questions:

1. What is the plan on integrating this into the master branch?
2. What are the other optimization branches (optimization-1, optimization-2, and optimization-3) for?

cdrnet commented 9 years ago

@tibel

> What is the plan on integrating this into the master branch?

The API has not stabilized enough yet, so either we delay merging into master, or we explicitly opt out of semantic versioning for this section, e.g. by marking it as [Obsolete("Preliminary API. Breaking changes between minor releases expected.")]
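
Applied to a type, that opt-out would look like this (IEvaluator is used only for illustration):

using System;

[Obsolete("Preliminary API. Breaking changes between minor releases expected.")]
public interface IEvaluator
{
    // ...
}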

> What are the other optimization branches (optimization-1, optimization-2, and optimization-3) for?

These were earlier prototypes. The idea of this unified_optimization branch is to combine these 3 branches (and also a few others not pulled into mathnet yet) and their ideas into one unified and consistent API. Some of them have working implementations that still need to be migrated over into this branch, and one of them is leveraging MKL already.

tibel commented 9 years ago

@cdrnet Maybe adding a task list to this issue would be good, so that it can be tracked what still has to be done (e.g. migrate the MKL implementation from branch xxx, or stabilize the API).

grovesNL commented 9 years ago

Any ideas on a path forward for this? It seems like it's mostly semantics at this point. Personally I think IEvaluator or IObjective would make sense in this context and shouldn't cause too much confusion in the future. Either way, it would be good to reach a consensus so we can stabilize the API.

Which pieces are left to merge into the unified_optimization branch? Are we only missing the MKL implementation?

cdrnet commented 9 years ago

I've started merging algorithm implementations from the other prototype branches into the unified one, but this is still work in progress, i.e. there's currently some duplication.

eriove commented 9 years ago

@ndalchau There is now an implementation of Nelder-Mead simplex algorithm that you can try.
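
For anyone who wants to try it, a call would look roughly like the sketch below, reusing the finite-difference wrapper sketched earlier as a value-only objective. The exact type and method names on the branch may differ, so treat this as illustrative.

using System;
using MathNet.Numerics.LinearAlgebra;
using MathNet.Numerics.Optimization;

// Minimize the Rosenbrock function, starting from (-1.2, 1);
// Nelder-Mead only ever asks for function values, never gradients.
var objective = new FiniteDifferenceObjectiveFunction(
    x => Math.Pow(1 - x[0], 2) + 100 * Math.Pow(x[1] - x[0] * x[0], 2));
var initialGuess = Vector<double>.Build.Dense(new[] { -1.2, 1.0 });

var solver = new NelderMeadSimplex(convergenceTolerance: 1e-8, maximumIterations: 1000);
var result = solver.FindMinimum(objective, initialGuess);
Console.WriteLine(result.MinimizingPoint);  // expect approximately (1, 1)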

@cdrnet Is there anything I can help with to enable you to merge this into the master branch? As @tibel suggested, a task list would be useful; even if it doesn't contain everything that needs to be done, it still makes it easier to help.

cdrnet commented 9 years ago

Blocking merging into master:

Non-blocking remaining tasks, off the top of my head:

eriove commented 8 years ago

Resurrecting this old issue. I found the bug in TestBfgsMinimizer.FindMinimum_BigRosenbrock_Hard: it was a typo. The correct result should have been 100, not 1. See PR #391.

eriove commented 8 years ago

If there is still interest in using this branch, I can try to combine the 3 line search classes (I'll have to start working on it to be sure whether a common base class or just an option is the best way to go).

I could also try to combine as much as possible of BfgsSolver, BfgsBMinimizer, and BfgsMinimizer. As I understand it, the functional difference between BfgsBMinimizer and BfgsMinimizer is that BfgsBMinimizer is bounded while BfgsMinimizer isn't?
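
If that is right, the difference would show up in usage roughly like this (signatures assumed for illustration, following my reading of the branch; objective and initialGuess as in the earlier sketches):

// Unbounded: only an initial guess.
var bfgs = new BfgsMinimizer(1e-5, 1e-5, 1e-5, maximumIterations: 1000);
var unbounded = bfgs.FindMinimum(objective, initialGuess);

// Bounded: the same problem restricted to a box.
var lowerBound = Vector<double>.Build.Dense(2, -5.0);
var upperBound = Vector<double>.Build.Dense(2, 5.0);
var bfgsB = new BfgsBMinimizer(1e-5, 1e-5, 1e-5, maximumIterations: 1000);
var bounded = bfgsB.FindMinimum(objective, lowerBound, upperBound, initialGuess);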

cdrnet commented 8 years ago

Yes, I'm still interested in this branch and looking forward to finally integrating it into the mainline.

Currently it is still way too messy, mostly because of me merging two different but somewhat overlapping spikes. It would be great if we could clean this up and finally move forward. Regarding BfgsBMinimizer: yes, that is my understanding as well (but I was not directly involved).

eriove commented 8 years ago

I'll start working on this.

scottstephens commented 8 years ago

I've been absent from this thread for a while; I'm sorry about that. However, I've got some newly acquired day-job motivation to add finite-difference support to the library, so I'll be working on that in the days to come. Along the way I'd like to improve the testing and documentation of the library, which I think should get us pretty close to being able to merge this into master.

Besides having fresh knowledge of the code, I will have unusually high availability over the next three weeks or so; if there's anything I can help with to get this ready to merge, the near future is a good time to ask.

cdrnet commented 8 years ago

That's great news, we shouldn't miss this opportunity!

I've just created an Optimization Project board; maybe this can help us move forward (about projects). I might need to add you to the project, though, so you can add cards and move them around.

tibel commented 7 years ago

@cdrnet With the latest merge I think this issue and #173 can be closed, right? Or am I missing something?

cdrnet commented 7 years ago

Yes, indeed!