lmfit / lmfit-py

Non-Linear Least Squares Minimization, with flexible Parameter settings, based on scipy.optimize, and with many additional classes and methods for curve fitting.
https://lmfit.github.io/lmfit-py/

fitting several datasets with common parameters #217

Closed hstrey closed 9 years ago

hstrey commented 9 years ago

In my work, I often need to fit several datasets at once where some parameters are shared between datasets and some vary. I created an IPython notebook that illustrates the idea (it is at the bottom): https://github.com/hstrey/bme502/blob/master/class%20notebooks/lmfit.ipynb

I was wondering whether anyone has thought of implementing something like this in a more formal way. For example, the input for x and y could be 2-d arrays, and parameters could be either varying across datasets or be fixed. I would be interested in incorporating this functionality into the Model class. I have some solutions for using several datasets with lmfit.minimize. In short, one calculates the residuals as a matrix and returns the flattened matrix.
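
As a rough illustration of the flattened-residual idea (this is not the code from the linked notebook), the sketch below builds one row of residuals per dataset and returns them as a single flat sequence. The names `gaussian` and `flattened_residuals`, the `amp_i`/`cen_i`/`sigma` naming scheme, and the plain dict of values are assumptions made for the example. With lmfit, the values would instead be read from a `Parameters` object via `params[name].value`, and a function of this shape would be handed to `lmfit.minimize`.

```python
import math
from itertools import chain


def gaussian(x, amp, cen, sigma):
    """Gaussian peak evaluated at a single x value."""
    return amp * math.exp(-((x - cen) ** 2) / (2.0 * sigma ** 2))


def flattened_residuals(values, x, ydata):
    """Residuals for every dataset, returned as one flat list.

    values: mapping of parameter name -> float (stand-in for lmfit Parameters).
    x:      x values common to all datasets.
    ydata:  one row of y values per dataset (the "2-d" input).
    """
    sigma = values["sigma"]  # shared by every dataset
    rows = []
    for i, y in enumerate(ydata):
        amp = values[f"amp_{i}"]  # varies from dataset to dataset
        cen = values[f"cen_{i}"]  # varies from dataset to dataset
        rows.append([gaussian(xj, amp, cen, sigma) - yj for xj, yj in zip(x, y)])
    # A least-squares minimizer works on a 1-d sequence, so flatten the matrix.
    return list(chain.from_iterable(rows))


# Synthetic, noise-free data for two datasets that share sigma.
x = [0.5 * k for k in range(-10, 11)]
true = {"sigma": 1.2, "amp_0": 3.0, "cen_0": -1.0, "amp_1": 5.0, "cen_1": 0.5}
ydata = [
    [gaussian(xj, true[f"amp_{i}"], true[f"cen_{i}"], true["sigma"]) for xj in x]
    for i in range(2)
]

resid = flattened_residuals(true, x, ydata)
```

Because `sigma` appears only once in the parameter set, the minimizer sees it as a single variable constrained by both datasets, while each `amp_i` and `cen_i` is constrained by its own row only.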

Let me know whether there is an easier approach or solutions. Otherwise, I am happy to think more about a formal implementation if there is interest.

Helmut

tacaswell commented 9 years ago

This sounds like it is starting to move toward complex modeling. diffpy ( http://www.diffpy.org/ ) might be a more fleshed-out framework for that at this point.

newville commented 9 years ago

Hi Helmut,

I'm replying to the lmfit mailing list as well as Github Issues -- we try to use the mailing list for general questions and Issues for actual issues with the code.

I don't know if anyone has given a lot of thought to a general solution for fitting multiple data sets. It's definitely worth thinking about how it might be improved.

Off the top of my head, I would imagine that creating a "DataSet" class that held the data and had a Model could be useful for general curve fitting problems. Then, one might have a CurveFit object that contained a set of Parameters and many DataSets, each with data arrays and its own Model. For a fit to a single data set, this would be overkill, but using Model for multiple data sets does not scale very well. I think such an approach would not be hard, it's just a matter of what we actually want it to look like. I'm sure we can come up with something workable.
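
To make the DataSet/CurveFit idea a little more concrete, here is one possible shape it could take. This is only a sketch: neither class exists in lmfit, a plain dict of floats stands in for Parameters, bare functions stand in for Model, and the `param_names` mapping is an invented way of saying which shared parameter feeds which model argument.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class DataSet:
    """One set of data plus the model used to describe it (hypothetical)."""
    x: Sequence[float]
    y: Sequence[float]
    model: Callable[..., float]  # called as model(x_i, **kwargs)
    # Maps the model's argument names to names in the shared parameter set.
    param_names: Dict[str, str]

    def residual(self, values: Dict[str, float]) -> List[float]:
        kwargs = {arg: values[name] for arg, name in self.param_names.items()}
        return [self.model(xi, **kwargs) - yi for xi, yi in zip(self.x, self.y)]


@dataclass
class CurveFit:
    """A shared set of parameter values and many DataSets (hypothetical)."""
    values: Dict[str, float]
    datasets: List[DataSet] = field(default_factory=list)

    def residual(self, values: Optional[Dict[str, float]] = None) -> List[float]:
        values = self.values if values is None else values
        out: List[float] = []
        for ds in self.datasets:
            out.extend(ds.residual(values))  # concatenate per-dataset residuals
        return out


def line(x, slope, intercept):
    return slope * x + intercept


def parabola(x, curvature, intercept):
    return curvature * x * x + intercept


x = [0.0, 1.0, 2.0, 3.0]
fit = CurveFit(
    values={"slope": 2.0, "curvature": 0.5, "offset": 1.0},
    datasets=[
        # Both datasets map their "intercept" argument to the shared "offset".
        DataSet(x, [line(xi, 2.0, 1.0) for xi in x], line,
                {"slope": "slope", "intercept": "offset"}),
        DataSet(x, [parabola(xi, 0.5, 1.0) for xi in x], parabola,
                {"curvature": "curvature", "intercept": "offset"}),
    ],
)
total = fit.residual()
```

The concatenated list returned by `CurveFit.residual` is what a minimizer would work on, so a single-dataset fit is just the special case of a list with one entry.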

This is worth thinking about.

--Matt

andyfaff commented 9 years ago

Helmut and lmfit workers, I have exactly this requirement: the need to fit several datasets simultaneously, with joint parameters. My interest is in the simultaneous fitting of multiple contrasts of neutron and X-ray scattering patterns.

I have already written code to do this, contained in the curvefitter.py file in the refnx project:

https://github.com/andyfaff/refnx/blob/master/refnx/analysis/curvefitter.py

The GlobalFitter class is the bit that does all the work. It takes a list of refnx.analysis.CurveFitter instances, plus a list of constraints that can be used to link parameters across datasets.

The refnx.analysis.CurveFitter class extends lmfit.Minimizer, and is designed to fit an individual dataset. The CurveFitter class is designed so that functions of the form fitfunc(xdata, params, *fcn_args, **fcn_kws) can be used (alternatively, you can override the model method). fitfunc returns the model, not the residuals. In many respects CurveFitter is similar to lmfit.Model, except that I had the requirement that the number of parameters for a given model should be arbitrary, preventing introspection.
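
For readers unfamiliar with this calling convention, the following sketch (not the refnx implementation) shows how a fitfunc that returns the model can be wrapped into a residual function, and why a model with an arbitrary number of parameters cannot be described by inspecting its signature. The helper `make_residual`, the `polynomial` model, and the use of a plain list in place of lmfit Parameters are all assumptions made for the example.

```python
def make_residual(fitfunc, xdata, ydata, edata=None, fcn_args=(), fcn_kws=None):
    """Wrap a model-returning fitfunc into a function returning residuals."""
    fcn_kws = fcn_kws or {}

    def residual(params):
        model = fitfunc(xdata, params, *fcn_args, **fcn_kws)
        if edata is None:
            return [m - y for m, y in zip(model, ydata)]
        # Weight each point by its uncertainty when uncertainties are given.
        return [(m - y) / e for m, y, e in zip(model, ydata, edata)]

    return residual


def polynomial(xdata, params):
    """Polynomial whose order is set by how many coefficients are passed.

    The signature says nothing about the number of parameters, so they
    cannot be discovered by introspecting the function's arguments.
    """
    return [sum(c * x ** k for k, c in enumerate(params)) for x in xdata]


xdata = [0.0, 1.0, 2.0]
ydata = polynomial(xdata, [1.0, 2.0, 3.0])  # quadratic "data"

residual = make_residual(polynomial, xdata, ydata)
exact = residual([1.0, 2.0, 3.0])  # same coefficients as the data
linear_only = residual([1.0, 2.0])  # same fitfunc, fewer parameters
```

The same `polynomial` function serves a two-parameter and a three-parameter fit without any change, which is the flexibility that signature-based introspection cannot provide.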

The GlobalFitter works well for the refnx project, but I would love to have something upstream in lmfit. I'm very keen on contributing to the creation of this kind of functionality in the lmfit project.

Dr. Andrew Nelson


newville commented 9 years ago

Hi Andrew,

...and I'm replying to this on the lmfit-py mailing list as well.... Can we please use the mailing list for this sort of discussion and leave Github Issues for actual Issues with the code?

Yes, I think that the lmfit.Model class is a good beginning, but not a complete solution to Curve Fitting. I think much of the discussion about "state" over the past few months is probably related to this: The abstract Model should be stateless, but a Curve Fitting problem is absolutely not stateless -- there are concrete data, Parameters, and Model(s). Having a CurveFitter class would probably make the statelessness of Model less confusing.

After a quick look, I'd be in favor of a CurveFitter class in lmfit that was similar (but perhaps generalizing the data) to your CurveFitter. I'd think it should be possible to seamlessly handle multiple data sets by using a list of DataSets/Models -- do you think that could work? Do you think inheriting from Minimizer is preferred to "having a" Minimizer? Since re-working Minimizer to better handle the parameter copying and state is on the To-Do list, maybe that can be done in a way to work well for a CurveFitter class.
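
The inherit-versus-"have a" question can be sketched with a toy example. `ToyMinimizer` below is a deliberately minimal stand-in and does not reproduce lmfit.Minimizer; the two fitter classes and the `residual_for` helper are likewise hypothetical and only illustrate the two layouts.

```python
class ToyMinimizer:
    """Stand-in for a minimizer: stores an objective and its parameters."""

    def __init__(self, userfcn, params):
        self.userfcn = userfcn
        self.params = params

    def sum_of_squares(self):
        return sum(r * r for r in self.userfcn(self.params))


def residual_for(model, x, y):
    """Build a residual function for one dataset."""
    def residual(params):
        return [model(xi, params) - yi for xi, yi in zip(x, y)]
    return residual


class InheritingFitter(ToyMinimizer):
    """'Is a' minimizer: every minimizer method becomes part of the fitter."""

    def __init__(self, model, x, y, params):
        super().__init__(residual_for(model, x, y), params)


class ComposingFitter:
    """'Has a' minimizer: the minimizer is an attribute that can be swapped."""

    def __init__(self, model, x, y, params, minimizer_cls=ToyMinimizer):
        self.residual = residual_for(model, x, y)
        self.minimizer = minimizer_cls(self.residual, params)

    def sum_of_squares(self):
        # Only the methods the fitter chooses to expose are forwarded.
        return self.minimizer.sum_of_squares()


def linear(x, params):
    return params["slope"] * x


x, y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
a = InheritingFitter(linear, x, y, {"slope": 1.0})
b = ComposingFitter(linear, x, y, {"slope": 1.0})
```

Both layouts give the same numbers. The difference is that the inheriting version is tied to the base class's internal state, whereas the composing version can rebuild or replace its minimizer, which may matter if Minimizer's handling of parameter copying and state is going to be reworked.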

--Matt

newville commented 9 years ago

use mailing list!

newville commented 9 years ago

Thanks for the prompt reply. I like the idea of a DataSet object and that each dataset can be associated with a different model (but with potentially shared parameters). That is exactly what I need. The tricky part is to figure out a general way to keep track of which parameters are shared between different models. I will look at the GlobalFitter code and think about how this could be implemented.
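
One possible way to do that bookkeeping, sketched here under assumptions rather than taken from lmfit or refnx, is to give every dataset its own local parameter names and resolve them to global names through an explicit table of links. The function `resolve_names`, the dataset labels, and the parameter names are all invented for the example. In lmfit itself, tying one parameter to another through a Parameter's `expr` constraint is another route to the same result.

```python
def resolve_names(models, links):
    """Map each dataset's local parameter names to global names.

    models: {dataset label: [local parameter names]}
    links:  {global name: [(dataset label, local name), ...]} for the
            parameters that are shared between datasets.
    Returns {dataset label: {local name: global name}}.
    """
    owner = {key: gname for gname, members in links.items() for key in members}
    return {
        label: {
            # Unlinked parameters get a name that is unique to their dataset.
            name: owner.get((label, name), f"{label}_{name}")
            for name in names
        }
        for label, names in models.items()
    }


# Two different models whose decay constant is the same physical quantity.
models = {"d0": ["amp", "decay"], "d1": ["amp", "rate", "bkg"]}
links = {"tau": [("d0", "decay"), ("d1", "rate")]}

mapping = resolve_names(models, links)
free_parameters = sorted({g for names in mapping.values() for g in names.values()})
```

Here five local parameters collapse to four global ones, because `decay` in the first model and `rate` in the second both resolve to `tau`. The models do not need to use the same local name for a shared quantity.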

Helmut
