tBuLi / symfit

Symbolic Fitting; fitting as it should be.
http://symfit.readthedocs.org
MIT License

Global minimizers #155

Closed: pckroon closed this 6 years ago

pckroon commented 6 years ago

So this adds a Differential Evolution minimizer, which is awesome. I still need to do docstrings, tests and possibly documentation.

You can already look at the code though.

tBuLi commented 6 years ago

So far, I really like what you've done with this one! Looking forward to docs and tests ;).

Just one critical note: I'm not sure I like the factory function for ChainedMinimizer. Wouldn't it make more sense to just have __init__ accept a minimizers (keyword) argument? E.g.

def f(x, a, b):
    return a * x + b

def chi_squared(a, b):
    return np.sum((ydata - f(xdata, a, b))**2)

custom_minimize = ChainedMinimizer(chi_squared, [a, b], minimizers=[DifferentialEvolution, SLSQP])
fit_results = custom_minimize.execute()

I think this looks cleaner than

CustomMinimizer = ChainedMinimizer([DifferentialEvolution, SLSQP])
custom_minimize = CustomMinimizer(chi_squared, [a, b])

And I wrote fit_results in plural because I think it would help users for debugging purposes if this method returned all the FitResults, as a list.

As some final icing on the cake, we can have Fit accept a list for the minimizer argument, which we will then simply wrap internally. This would then for most use cases reduce the above code to

x, y = variables('x, y')
a, b = parameters('a, b')

model = {y: a * x + b}
fit = Fit(model, x=xdata, y=ydata, minimizer=[DifferentialEvolution, SLSQP])
fit_results = fit.execute()

minimizer_kwargs could then be passed as

fit_results = fit.execute(DifferentialEvolution={'popsize': 100}, SLSQP={'tol': 1e-9})

(this last part about Fit should perhaps be done as a separate PR)

pckroon commented 6 years ago

I'm not sure why I made ChainedMinimizer a factory; it made sense at the time. It's a great idea to have Fit accept a sequence of minimizers, and let it deal with the exact details. I'll still implement that in this PR. The minimizer kwargs for fit.execute() I'll leave to you.

I'm not sure I agree with making fit_results a sequence, since then you can no longer do model(**fit_results.params). What we (I?) can do is make a FitResults object that behaves like a Sequence, but delegates any attribute access to its last FitResult.
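
A minimal sketch of what such a delegating sequence could look like (the class name and details below are purely illustrative, not existing symfit code):

from collections.abc import Sequence

class ChainedFitResults(Sequence):
    """Illustrative only: a sequence of FitResults that delegates
    attribute access to the last (final) result."""
    def __init__(self, results):
        self._results = list(results)

    def __getitem__(self, index):
        return self._results[index]

    def __len__(self):
        return len(self._results)

    def __getattr__(self, name):
        # Anything not defined on the wrapper itself (e.g. .params)
        # is looked up on the final FitResult.
        return getattr(self._results[-1], name)

With something like this, model(**fit_results.params) would keep working, while fit_results[0] would still give you the DifferentialEvolution result for debugging.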

tBuLi commented 6 years ago

Good point about model(**fit_results.params), I forgot about that.

I'm a bit anxious about making a custom sequence object, as I had a bad experience with this in an earlier symfit version ;). Perhaps it could be as simple as adding a .from attribute to a FitResults to indicate its 'parent'?

Let's meditate on this some more to see if this can be done in an idiomatic way.

tBuLi commented 6 years ago

I just realised that this code breaks the convention that symfit doesn't change the .value of your Parameters but returns the best-fit values in a FitResults object: this code updates the .value attribute on the Parameter objects directly.

Apart from breaking the current convention, I can also see future users getting confused and frustrated because, unbeknownst to them, the value is no longer what they originally set.

For example (plot of the skewed Mexican hat omitted):

x, y = parameters('x, y')
x.value = -2.5
y.value = -20

model = {y: x**4 - 10 * x**2 - x} # Skewed Mexican hat
fit = Fit(model, minimizer=[DifferentialEvolution, SLSQP])
fit_result1 = fit.execute()

fit = Fit(model)
fit_result2 = fit.execute()

I would expect the first one to find the global minimum, and the second to give the local minimum. But it doesn't: this will give the global minimum twice.

So even though I do like the addition of the setter for initial_guesses, our internal usage of it shouldn't leave a trace. The pythonic way to deal with a situation like this is a context manager; the most sensible option to me would be adding __enter__ and __exit__ to ChainedMinimizer and then calling it from Fit.execute() as

# This line is in Fit.__init__
self.minimizer = ChainedMinimizer(chi_squared, [a, b], minimizers=[DifferentialEvolution, SLSQP]) 

# In Fit.execute:
with self.minimizer as minimizer:
    minimizer_ans = minimizer.execute(**minimize_options)

__enter__ and __exit__ then contain the logic to restore the initial guesses upon exit.
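
Roughly, those two methods could look something like this (assuming the minimizer keeps its Parameter objects in a parameters attribute; names are illustrative, not a final implementation):

class ChainedMinimizer:
    # Only the context-manager part is sketched here.
    def __enter__(self):
        # Remember the user's original values before the chain runs.
        self._saved_values = [p.value for p in self.parameters]
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Restore the original values so fitting leaves no trace
        # on the Parameter objects.
        for param, value in zip(self.parameters, self._saved_values):
            param.value = value
        return False  # don't suppress exceptions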

pckroon commented 6 years ago

I completely agree that the values of parameters should not change during fitting. This is what I get for dusting off old code and turning it into a PR quickly. I don't really like having to call fit.execute multiple times though: I'm fitting only once, even if it internally uses multiple minimisation steps. As an alternative to a context manager, how about giving minimizers an _initial_guesses attribute which defaults to None? The getter for initial_guesses then returns _initial_guesses, unless it's None, in which case it returns the current values of the parameters. The setter for initial_guesses can then simply set _initial_guesses.
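
A minimal sketch of that alternative (class and attribute names are illustrative):

class BaseMinimizer:
    def __init__(self, objective, parameters):
        self.objective = objective
        self.parameters = parameters
        self._initial_guesses = None  # no explicit guesses set yet

    @property
    def initial_guesses(self):
        # Fall back to the current Parameter values unless guesses were
        # set explicitly; the Parameter objects themselves stay untouched.
        if self._initial_guesses is None:
            return [p.value for p in self.parameters]
        return self._initial_guesses

    @initial_guesses.setter
    def initial_guesses(self, guesses):
        self._initial_guesses = guesses

That way ChainedMinimizer could feed one minimizer's result into the next via initial_guesses without ever writing to Parameter.value.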

pckroon commented 6 years ago

TODO:

pckroon commented 6 years ago

I added docstrings and two tests. It should be noted that the tests are a little expensive to run. Any edge cases I missed? And do we need additional documentation beyond the docstrings?

tBuLi commented 6 years ago

As for further documentation, a quick description of its usage on this page is in order for such a shiny feature. That way the feature is better advertised, documented, and googleable.

As a subsection to the description of DifferentialEvolution, you could also add a few words on how it's good practice to string minimizers together using the new Fit(..., minimizer=[your minimizers here]) syntax. That also deserves some more attention.

As for tests, I would still like to see my comment above about the skewed Mexican hat potential turned into a test. It's basically a copy-paste from above, and it lets us discriminate between global and local minima, so that if we ever accidentally mess up the guess handling again in the future, this test will fail.
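
A rough sketch of what that test could look like, reusing the example from above (assertions, bounds and tolerances would still need tuning):

from symfit import parameters, Fit
from symfit.core.minimizers import DifferentialEvolution, SLSQP

def test_chained_minimizer_restores_guesses():
    # Skewed Mexican hat: global minimum near x ~ +2.25, local minimum
    # near x ~ -2.20 when starting from x = -2.5.
    # (DifferentialEvolution may additionally need x.min / x.max bounds.)
    x, y = parameters('x, y')
    x.value = -2.5
    y.value = -20

    model = {y: x**4 - 10 * x**2 - x}

    fit = Fit(model, minimizer=[DifferentialEvolution, SLSQP])
    fit_result1 = fit.execute()

    fit = Fit(model)
    fit_result2 = fit.execute()

    # The chained fit finds the global minimum, the plain fit only the
    # local one, and the user's initial value must be left untouched.
    assert fit_result1.params['x'] > 0
    assert fit_result2.params['x'] < 0
    assert x.value == -2.5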

pckroon commented 6 years ago

As an update, the hyperparameter optimisation has been running for a week now, and still seems to be going strong. I do hope it actually produces meaningful results.

Jhsmit commented 6 years ago

bump for Differential Evolution! Don't know what hyperparameter optimization is, but I definitely want that too.

pckroon commented 6 years ago

@Jhsmit There's a bunch of parameters that determine the behaviour of the differential evolution algorithm. These are hyperparameters, and they can (should) be optimised, e.g. for fitting. @tBuLi I'm also OK with merging this as-is and putting in the hyperparameters later.
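
For context, these are the kind of knobs meant here; a quick sketch using scipy's underlying differential_evolution directly (the values are arbitrary examples, not recommendations):

from scipy.optimize import differential_evolution

def skewed_mexican_hat(p):
    x = p[0]
    return x**4 - 10 * x**2 - x

# strategy, popsize, mutation, recombination and tol are the hyperparameters
# being tuned here.
result = differential_evolution(
    skewed_mexican_hat,
    bounds=[(-100, 100)],
    strategy='best1bin',
    popsize=20,
    mutation=(0.5, 1.0),
    recombination=0.7,
    tol=1e-6,
    seed=42,
)
print(result.x, result.fun)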

tBuLi commented 6 years ago

I agree that the hyperparameter stuff is an issue for later. (And to some extent also for the user; with a power tool like this, some thinking will be required. We just provide a convenient API.)

However, before merging I would also like to change the **minimizer_kwargs to the proposed syntax, and I prefer to make those changes in this PR. I will do this as soon as possible, but since my day job is currently not an excuse to develop symfit, that might take a while.

pckroon commented 6 years ago

I forgot about that part to be honest. And fair enough, but it's an excellent procrastination project :)

Jhsmit commented 6 years ago

@pckroon That does sound like an amazing feature. I've found myself writing for loops over scipy.optimize.minimize with all the different method options and then logging the time spent and the minimizer result. But this is an absolute pain and it takes forever.

Right now I'm using differential evolution, which does give a lower minimum value but takes way longer. I'm using the default options, which work, but I can't be bothered to spend time optimizing the hyperparameters. So yeah, I'd definitely want a nice API for that ;)

PyGMO also looks interesting.

pckroon commented 6 years ago

Oh you won't get a nice API for optimising hyperparameters (it's rarely worth it anyway). I'm trying to find decent DE options fit for fitting. And yes, DE is slower than local optimisation. But that's inherent.

Jhsmit commented 6 years ago

But won't the decent DE options vary from fitting problem to fitting problem?

pckroon commented 6 years ago

Well, yes. But I hope I picked a problem that's representative. IIRC I took a Gaussian to optimise for.


pckroon commented 6 years ago

Closed due to pending rebase