psyplot / psy-reg

Psyplot plugin for visualizing and calculating regression plots
GNU Lesser General Public License v3.0

Using scipy's genetic algorithm for initial parameter estimation in curve_fit() #1

Closed. zunzun closed 6 years ago

zunzun commented 6 years ago

In your Python code for curve_fit() you are not specifying the p0 starting parameters for the Levenberg-Marquardt solver, so curve_fit() falls back to scipy's default initial parameter values of all 1.0. This can be suboptimal, and with more complex equations it often results in the algorithm finding a local minimum in error space. For this reason, the authors of scipy added a genetic algorithm that can be used for initial parameter estimation. The function is named scipy.optimize.differential_evolution.
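A minimal sketch of the idea, assuming a single Lorentzian peak and synthetic data (the model, the data, the bounds, and all names below are made up for illustration and are not taken from psy-reg or from the project linked in this comment): differential_evolution searches the bounded parameter space for a low sum of squared errors, and its result is passed to curve_fit() as p0.

```python
import numpy as np
from scipy.optimize import curve_fit, differential_evolution


def lorentzian(x, amplitude, center, width, offset):
    """Single Lorentzian peak on a constant baseline (illustrative model)."""
    return amplitude / (1.0 + ((x - center) / width) ** 2) + offset


# Synthetic data, invented for this sketch.
rng = np.random.default_rng(0)
true_params = (5.0, 62.0, 3.0, 1.0)
x_data = np.linspace(0.0, 100.0, 400)
y_data = lorentzian(x_data, *true_params) + rng.normal(0.0, 0.05, x_data.size)


def sum_of_squared_errors(params):
    """Objective minimized by the genetic algorithm."""
    return np.sum((y_data - lorentzian(x_data, *params)) ** 2)


# differential_evolution needs one (min, max) pair per parameter.
# These ranges are assumptions chosen to bracket the synthetic data.
bounds = [(0.0, 10.0), (0.0, 100.0), (0.1, 20.0), (-5.0, 5.0)]

# Global search for a starting point; the seed only makes the run repeatable.
de_result = differential_evolution(sum_of_squared_errors, bounds, seed=3)
p0 = de_result.x

# Local refinement, starting from the estimate instead of all 1.0.
fitted, covariance = curve_fit(lorentzian, x_data, y_data, p0=p0)
```

The bounds only need to be rough; they tell the genetic algorithm where to search, and curve_fit() then does the final refinement from that starting point.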

I have used scipy's Differential Evolution genetic algorithm to determine initial parameters for fitting a double Lorentzian peak equation to Raman spectroscopy data and found that the results were excellent. The GitHub project, with a test spectroscopy data file, is:

https://github.com/zunzun/RamanSpectroscopyFit

If you have any questions, please let me know.

James Phillips

Chilipp commented 6 years ago

Hi @zunzun,

Thanks for that helpful comment! I was planning to implement a formatoption for the initial values anyway, and it would be nice to implement your methodology. Could you say anything about the performance of the differential_evolution function? This becomes an important issue when it is combined with bootstrapping to estimate the uncertainty interval.

zunzun commented 6 years ago

Relative to other genetic algorithms, Differential Evolution generally seems to give very good results in the shortest time - which is why it was added to scipy. Given the choice between a fast, bad result and a somewhat slower correct result, most people would choose the correct result.

As a quick experiment, try running my Raman Spectroscopy example without passing initial parameter values to curve_fit() - the results are totally useless.

Chilipp commented 6 years ago

I agree. Thanks again! I plan to implement it this week and will keep you informed about the progress.

zunzun commented 6 years ago

Cool.

Chilipp commented 6 years ago

Hi @zunzun,

I implemented two formatoptions: the p0 formatoption, which by default estimates the initial parameters automatically, and the param_bounds formatoption, which specifies the boundaries for the function parameters (required by the differential_evolution function). These boundaries are then also used for the curve_fit call.
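Outside of psyplot, the combination described above might look roughly like the following sketch. The helper fit_with_bounds and its arguments are hypothetical and do not reflect psy-reg's actual implementation. One set of (min, max) pairs drives differential_evolution and is then converted to the (lower, upper) form that curve_fit() expects for its bounds argument. Note that curve_fit() switches from Levenberg-Marquardt to the 'trf' method when bounds are given.

```python
import numpy as np
from scipy.optimize import curve_fit, differential_evolution


def fit_with_bounds(func, x, y, param_bounds, p0="auto", seed=None):
    """Hypothetical helper: estimate p0 if requested, then run a bounded fit.

    param_bounds is a list of (min, max) pairs, one per parameter of func.
    """
    param_bounds = [tuple(b) for b in param_bounds]
    lower, upper = zip(*param_bounds)

    if isinstance(p0, str) and p0 == "auto":
        def sum_of_squared_errors(params):
            return np.sum((y - func(x, *params)) ** 2)

        # Global search inside the same bounds the final fit will use.
        p0 = differential_evolution(
            sum_of_squared_errors, param_bounds, seed=seed).x

    # Keep the starting point inside the bounds, as curve_fit requires.
    p0 = np.clip(p0, lower, upper)
    return curve_fit(func, x, y, p0=p0, bounds=(lower, upper))


def decay(x, a, b, c):
    """Exponential decay used only as example data for this sketch."""
    return a * np.exp(-b * x) + c


rng = np.random.default_rng(1)
x = np.linspace(0.0, 4.0, 50)
y = decay(x, 2.5, 1.3, 0.5) + rng.normal(0.0, 0.02, x.size)

bounds = [(0.0, 10.0), (0.0, 5.0), (-1.0, 1.0)]
auto_params, auto_cov = fit_with_bounds(decay, x, y, bounds, seed=0)
manual_params, manual_cov = fit_with_bounds(decay, x, y, bounds, p0=[1.0, 1.0, 0.0])
```

Passing an explicit p0 skips the genetic algorithm entirely, which is one way to avoid its cost when the same fit is repeated many times, for example during bootstrapping.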

Does this solve your issue? Please let me know if you have any suggestions.

Best, Philipp

zunzun commented 6 years ago

Most excellent, even better than I had hoped. I see you put some serious thought into this; the new options are really cool. Thank you for making these design changes so quickly.

Chilipp commented 6 years ago

You're welcome, thanks