lmfit / lmfit-py

Non-Linear Least Squares Minimization, with flexible Parameter settings, based on scipy.optimize, and with many additional classes and methods for curve fitting.
https://lmfit.github.io/lmfit-py/

Misleading documentation for Minimize.minimize - Minimize.params is not the same Parameters instance used to initialise the Minimizer. #174

Closed andyfaff closed 8 years ago

andyfaff commented 9 years ago

Let's consider the following code:

import numpy as np
from lmfit import minimize, Parameters, Parameter

x = np.linspace(0, 15, 301)
data = (5. * np.sin(2 * x - 0.1) * np.exp(-x*x*0.025) +
        np.random.normal(size=len(x), scale=0.2) )

# define objective function: returns the array to be minimized
def fcn2min(params, x, data):
    """ model decaying sine wave, subtract data"""
    amp = params['amp'].value
    shift = params['shift'].value
    omega = params['omega'].value
    decay = params['decay'].value

    model = amp * np.sin(x * omega + shift) * np.exp(-x*x*decay)
    return model - data

# create a set of Parameters
params = Parameters()
params.add('amp',   value= 10,  min=0)
params.add('decay', value= 0.1)
params.add('shift', value= 0.0, min=-np.pi/2., max=np.pi/2)
params.add('omega', value= 3.0)

# do fit, here with leastsq model
result = minimize(fcn2min, params, args=(x, data))

print(result.params is params)

The last line prints False, i.e. result.params is not the same instance as the params that was used to construct the Minimizer object. This is because in Minimizer.unprepare_fit there is a deepcopy:

self.params = deepcopy(self.params)

I would naively expect

result.params is params

to be True. Why is a copy done in unprepare_fit? Is it done to keep the initial Parameters object unchanged? If so, then it would be better to do the deepcopy before Minimizer.minimize is called. For example, you could get the following confusing behaviour:

obj = Minimizer(fcn2min, params, fcn_args=(x, data))
print(obj.params is params)  # prints True
obj.minimize()
print(obj.params is params)  # prints False
newville commented 9 years ago

@andyfaff: What is the Issue? In the email I received about this Issue, it had:

In http://lmfit.github.io/lmfit-py/fitting.html the documentation says that "On output, the params 
will be updated with best-fit values"

However, let's consider the following code:

which (it appears you edited it) now reads:

Let's consider the following code:

The title for this Issue clearly says the Documentation is misleading, and it is mentioned in the email I received, but this is no longer mentioned in the text. All that remains in the text is the (correct) observation that result.params is a copy of params, and some confusion about why that is.

I don't see the documentation as being misleading. The parameters are updated with best-fit values on output. That section of the doc does not mention result.params at all.

You're absolutely right that result.params is a different object from params. This is by design, and after some discussion. The point is that result (the return value from minimize()) holds the result of a fit. It is considered "done" (unless told otherwise). In this way one can do a fit, make some changes (say, parameter bounds, fitting methods, etc), do another fit (now using the updated parameters from the previous fit), and be able to compare the results. If result.params is params, the older parameter values would be lost and the fits could not be compared.

The copy is done after the fit (intentionally) so that both params and result.params have the best-fit values.
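As a sketch of that workflow (reusing the names from the example at the top of this issue, and assuming the current copy-on-output behaviour):

result1 = minimize(fcn2min, params, args=(x, data))

# tweak a bound between fits; params already holds the best-fit values
params['amp'].max = 8.0
result2 = minimize(fcn2min, params, args=(x, data))

# result1.params is a copy, so the first fit's values survive the second fit
print(result1.params['amp'].value, result2.params['amp'].value)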

tacaswell commented 9 years ago

I agree with @andyfaff that if the scheme is to mutate the input, then I would expect the returned Parameters object to be the input object.

This does raise the better question of why minimize mutates its input values, which is very contrary to my expectations of how functions behave. This seems like the sort of state management that was going to be moved away from after the prior discussion.

I have put in a PR (#175) to make it very hard to miss this fact.

andyfaff commented 9 years ago

When I originally submitted the issue I was looking at the minimize function documentation, but working with the Minimizer class. So I edited the issue. That's probably the source of the confusion. I'll explain further where I see the potential for confusion in a separate comment.

andyfaff commented 9 years ago

Actually, in my original message I made things worse because the example I presented used the minimize function, rather than demonstrating the confusing behaviour I was trying to understand. Sorry for muddying the water. There are actually two issues here I think. In this comment I'll address the first.

In the example above I used the minimize function. As @tacaswell says, minimize mutates the input. Normally I would expect the values in the input params to remain unchanged. As such, the params should be deepcopy'd before its values are changed, either in __init__ of the Minimizer object or in the minimize function.

Because the values in the input params were changing, one can jump to the erroneous conclusion that

result.params is params

So which way is correct? 1) If the input params are going to be mutated/changed, I think that

result.params is params

should be true. In which case the deepcopy in unprepare_fit shouldn't happen.

2) If the input params are going to be unchanged, then the deepcopy should occur in __init__, before any values are changed (a minimal sketch of this option is given below).
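For concreteness, a minimal sketch of option 2 might look like this (illustrative only, not the actual lmfit source; the attribute names are assumptions):

from copy import deepcopy

class Minimizer(object):
    def __init__(self, userfcn, params, fcn_args=None):
        # copy on input: the caller's params are never mutated
        self.params = deepcopy(params)
        self.userfcn = userfcn
        self.userargs = fcn_args if fcn_args is not None else ()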

Unfortunately, I don't think that #175 is the right way of going about it.

andyfaff commented 9 years ago

The second issue is related. I came across it when extending Minimizer, and it took a lot of time to understand what was going on. Here is my logic:

  1. Initialise the Minimizer (mini) with a Parameters instance (params).
  2. At this point mini.params is params is True.
  3. Now do a minimization, mini.minimize().
  4. At this point the values in params have been changed, and the values in mini.params are the same. I assumed that params and mini.params are the same instance. This assumption seems reasonable because the values are the same. However, they are not the same instance.

I find it confusing that mini.params may or may not be params depending on whether a fit has been done.

I think it would be less confusing if the deepcopy was performed when initialising the Minimizer instance (and the deepcopy removed from unprepare_fit). Here mini.params will remain the same instance for the life of the Minimizer instance, rather than changing when the minimize method is called.

Alternatively one could initialise the Minimizer instance with params and the mini.params attribute could remain the params instance for the lifetime of the Minimizer instance.

newville commented 9 years ago

@andyfaff @tacaswell -- we've had these discussions before. Unfortunately, we keep using github Issues for such discussions about functionality, so finding old discussions is hard.

So, I've re-opened the issue, and will respond to the specifics separately. But I am very reluctant to change this behavior without being strongly convinced to do so. And part of that convincing will involve PRs for code and doc, and explanation on the mailing list that are more explicit than simply referencing github Issues.

andyfaff commented 9 years ago

I'm gathering that the mailing list is the main location for such discussions then (don't forget I'm new to the lmfit community). I've got no problems with making PRs for both the code and documentation.

newville commented 9 years ago

@andyfaff, @tacaswell : The docs do state that params will have its values updated. Andrew pointed to this exact part of the doc in his original message. I'm all in favor of making the docs better. But this behavior is actually documented. Why anyone would read that and expect params to be identical to another object, result.params, is beyond me. It says nothing of the sort!

From the Python point of view, params is derived from an OrderedDict, so it is mutable. A function that receives it as an argument can (and may) alter the values of mutable types. Call-by-object is "normal" Python behavior. Expecting functions to never change mutable arguments is going to lead to disappointment.
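As a generic illustration of that point (plain Python, nothing lmfit-specific):

def scale_values(d, factor):
    # mutates the caller's dict in place
    for key in d:
        d[key] *= factor

vals = {'a': 1.0, 'b': 2.0}
scale_values(vals, 10)
print(vals)  # {'a': 10.0, 'b': 20.0} -- the argument was altered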

But this is not the issue raised! The Issue is not that params was altered, but that result.params is not params. That is, the issue actually raised (and still the point of @andyfaff's comments) is about object identity, not whether the values of that mutable object have been changed.

Is there a use case where comparing the identity of result.params and params is interesting? Like, why would anyone "jump to the conclusion" that two dictionaries are identical? What would cause anyone to care about the identity of params and result.params? I find it doubly odd that Andrew cares about the implementation detail of when a copy happens: It happens after the fit.... so what?

What's more, result.params is created as part of the output return value. Why would anyone ever expect a part of the data structure returned by a function to be identical to one of the function arguments? If these were identical, what would be the point in even having a result.params at all?

So, basically, @andyfaff , why is it that you're even testing result.params is params?

I see good reasons (explained above, discussed earlier) for the current behavior, and I'm not really seeing a reason to change it. If you still think this should change, you need to make a very good case for testing whether these two different objects are identical and not merely (and possibly temporarily) equal. I think you'll need to cover the cases of comparing the results of two successive fits in detail, and explain how that would work.

andyfaff commented 9 years ago

But this is not the issue raised! The Issue is not that params was altered, but that result.params is not params. That is, the issue actually raised (and still the point of @andyfaff's comments) is about object identity, not whether the values of that mutable object have been changed.

The documentation for the minimize method states:

On output, the params will be updated with best-fit values

I take this to mean that the original params used to initialise the Minimizer instance are changed. But consider the following piece of code where it really does matter that the mini.params attribute remains the same instance of Parameters:

mini = Minimizer(fcn2min, params, fcn_args=(x, data))
mini.minimize(method='differential_evolution')
# at this point mini.params and params have the same content, but are different instances
# now let's do a 'polishing' run
mini.minimize(method='leastsq')
# now params and mini.params are different

At the end of this code block mini.params and the original params instance used to initialise the Minimizer instance will contain different values. This is because on the second call to mini.minimize the original params instance is no longer updated. My take on this is that the behaviour is in disagreement with the documentation line above.
A common mode of usage may be to make several calls to the minimize method on the same Minimizer instance, e.g. with scalar_minimize, leastsq (and hopefully Markov Chain Monte Carlo). You may hold a reference to mini.params somewhere else. Every time you call minimize, the reference you are holding becomes stale, as the sketch below shows. With a class-based minimizer like this I would expect mini.params to remain the same instance of Parameters for the lifetime of the mini instance, even if the values in that OrderedDict change after every mini.minimize invocation.
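For example (a sketch continuing the block above, assuming the current behaviour where unprepare_fit rebinds mini.params):

held = mini.params          # e.g. a GUI keeps this reference
mini.minimize(method='leastsq')
print(held is mini.params)  # False -- the held reference is now stale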

What's more, result.params is created as part of the output return value. Why would anyone ever expect a part of the data structure returned by a function to be identical to one of the function arguments? If these were identical, what would be the point in even having a result.params at all?

It's important to note that the behaviour I'm describing above is for the Minimizer.minimize method, not the minimize function. The minimize method only returns True/False.

However, I recognise that the behaviour of the minimize function should be different. I don't expect the params in the data structure returned by that function to be identical to the params argument.

What I propose is to remove the deepcopy in unprepare_fit. This means that when you use a Minimizer object the mini.params attribute remains the same instance of OrderedDict for the lifetime of mini. mini.params is the same instance as the params used to instantiate mini. At the same time the minimize function would change to:

def minimize(fcn2min, params, *args, **kwds):
    mini = Minimizer(fcn2min, params, *args, **kwds)
    # the fit updates the original params in place
    mini.minimize()
    # copy so that mini.params and params are no longer the same instance
    mini.params = deepcopy(params)
    return mini

I hope this clarifies things.

newville commented 9 years ago

@andyfaff wrote:

 I hope this clarifies things.

No. It does not.

You have not answered the fundamental question -- why is the identity of a dictionary important? Your original issue and every subsequent message focused on this as the basic Issue. You have not ever explained why this is important. You have wasted a lot of time with this.

You have only continued to elaborate your observation that they are different, and that the behavior is different from what you expect. Furthermore, you keep changing what you describe as "The Issue". The first question I asked when you started this conversation was "What is the Issue?". I still do not fully understand. I asked you very specifically to answer some questions and you have not done that. At this point, I'm more concerned about the poor communication than any perceived issue.

As I have explained, there are reasons for the current behavior. "Different from what someone expects" is not actually an Issue. Is there a problem?

You have not provided a simple test case that shows "the Issue" -- the one you posted originally uses the minimize() function, and now you're saying that the behavior only appears when using Minimizer.minimize(). You keep changing the issue you're talking about.

With your partial example:

    mini = Minimizer(fcn2min, params, fcn_args=(x, data))
    mini.minimize(method='differential_evolution')
    # at this point mini.params and params have the same content, but are different instances
    # now let's do a 'polishing' run
    mini.minimize(method='leastsq')
    # now params and mini.params are different

(which doesn't run, of course) you say:

At the end of this code block mini.params and the original params instance used to initialise the
Minimizer instance will contain different values. This is because on the second call to mini.minimize 
the original params instance is no longer updated. My take on this is that the behaviour is in
disagreement with the documentation line above.

The doc you point to describes the minimize() function, not Minimizer.minimize().

But, if you're using the Minimizer class, you can treat the params passed in at creation as input. Each time the minimize() method (or, really unprepare_fit()) is run, the old set of parameters will be updated, and Minimizer.params will get reset to a new Parameters instance with the latest values.

It is true that the copy happens in unprepare_fit so that after Minimizer.minimize() the previous version of Minimizer.params will be updated. Thus, the previous version and the latest version of Minimizer.params are equal but not identical. The docs are actually not specific on this behavior. It would be possible to move the copy from unprepare_fit to prepare_fit. That would leave the passed-in params unchanged, and not update the previous version between runs of minimize().
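To make the difference concrete, here is a toy sketch of the two copy placements (illustrative only, using a plain dict as a stand-in for Parameters; not the real lmfit source):

from copy import deepcopy

class ToyMinimizer(object):
    def __init__(self, params):
        self.params = params

    def minimize_copy_on_output(self):
        self.params['a'] = 1.0               # the "fit" updates the caller's params...
        self.params = deepcopy(self.params)  # ...then unprepare_fit rebinds self.params

    def minimize_copy_on_input(self):
        self.params = deepcopy(self.params)  # prepare_fit copies first...
        self.params['a'] = 1.0               # ...so the caller's params stay untouched

params = {'a': 2.0}
mini = ToyMinimizer(params)
mini.minimize_copy_on_output()
print(params['a'], mini.params is params)    # 1.0 False -- caller updated, identity broken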

Perhaps that is what you're confused by / unhappy with. If that is what you had described as the Issue and proposed such a change, I'd probably be in agreement, but cautious. To be clear, it is not at all the Issue you describe - your focus on the identity of Parameters instances is completely unimportant. It is also not your proposed change. Your proposal is to remove the copy from the Minimizer class. That would not be a good idea.

Sorry if I sound grumpy....

danielballan commented 9 years ago

@newville I wish you had paused before posting that response. I feel like you lost track of the fact that @andyfaff has devoted a chunk of his holiday vacation to improving lmfit. Whatever you think of his contributions, the scolding tone of some of your remarks is not an appropriate response to a volunteer. I suggest we let this issue cool for a couple days and revisit.

danielballan commented 9 years ago

@andyfaff Above, Matt alluded to an earlier discussion on this copy-on-input issue. It was in #56. His key points are in this comment. I agree with some of those points, in particular that changing to copy-on-input would break a lot of old user code. The original poster of that issue suggested working around the problem by implementing a new class, Model. Which, of course, we did.

At the time, Matt suggested adding warnings about the copying behavior in 0.8 and switching to fully copy-on-input in version 0.9:

I wonder if it might reasonable to implement copy-on-output (and reset() and copy() [methods for Parameter objects]) for 0.8.0, and change all the examples and docs to use Minimizer.params instead of the params argument, and deprecate this usage so that with 0.9.0 params becomes copy-on-input? I've never thought that far in advance about API changes or version numbers, but perhaps this is the right approach.

As far as I know we never revisited that discussion, focusing instead on fleshing out Model.

andyfaff commented 9 years ago

@danielballan, I've just read (and re-read) through #56 to make sure I understood what was discussed at that time. To recap, the original issue boiled down to usage of the minimize function. There were two calls to minimize; each used the same params instance for the initial guesses, but the data was different in each case. The output from each of those calls was out1 and out2. The problem was that out1.params and out2.params were the same, even though the data in the two fits were different. This was a consequence of the same params instance being reused throughout the call chain. In other words, params, out1.params and out2.params were the same instance. These statements:

params is out1.params
params is out2.params
out1.params is out2.params

would have all been True in the original issue. The solution was to perform a deepcopy of params, but such that:

params is altered to have the best values, correlations, and so forth

and

out1.params is a copy of this, so that later work with params does not alter the fitter results.

This deepcopy operation was added to the underlying Minimizer.unprepare_fit. The reason it was added there is so that the original input params could be altered to have the best values. However, it could also have been implemented as follows:

def minimize(func, params, *args, **kwds):
    output = Minimizer(func, params, *args, **kwds)
    # do the minimization on the original params so they're updated;
    # here we are assuming that the deepcopy in unprepare_fit is not done
    output.minimize()

    # now copy so that output.params and params are not the same instance
    par_copy = deepcopy(output.params)
    output.params = par_copy
    return output

This implementation still has the overall effect of modifying the original params, but the output.params is a copy, so that future work with params does not alter the output.params.

I totally agree that this is the desired type of functionality for the minimize function and we wouldn't want to do anything to change this behaviour whatsoever. Where I do feel there is room for improvement is in the Minimizer class. Take the following example:

import numpy as np
from lmfit import Parameters, report_fit, Minimizer

x = np.linspace(0, 1, 100)
y1 = 1.0 * np.exp(1.0 * x)

def residual(params, x, data):
    a = params['a'].value
    b = params['b'].value
    model = a * np.exp(b * x)
    return (data - model)

params = Parameters()
params.add('a', value = 2.0)
params.add('b', value = 2.0)

fitter = Minimizer(residual, params, fcn_args=(x, y1))

#do an initial fit with something other than leastsq
fitter.minimize(method='nelder-mead')

print("\n params from first fit")
report_fit(params)

#now use leastsq to polish
fitter.minimize(method='leastsq')

print("\n params from polishing")
report_fit(params)

This results in:

params from first fit
[[Variables]]
    a:   0.99999317 (init= 2)
    b:   0.99998273 (init= 2)
[[Correlations]] (unreported correlations are <  0.100)

params from polishing
[[Variables]]
    a:   0.99999317 (init= 2)
    b:   0.99998273 (init= 2)
[[Correlations]] (unreported correlations are <  0.100)

You'll notice:

  1. The first fit changes the original params instance.
  2. That after the polishing fit no uncertainties appear to have been added to the params instance.
  3. The report says that the initial values were 2 for the second polishing fit.

If I had done the fit report using the fitter.params instance we would have seen different results. The reason behind this is the deepcopy done in fitter.unprepare_fit. Every time one calls the fitter.minimize method the fitter.params instance changes. Obviously, I would expect the values inside the fitter.params instance to change, but in this case it's a totally new object. In a class-based minimizer, such as Minimizer, I would expect the fitter instance attributes created during initialisation to point to the same object throughout the object lifetime. There are several examples of why this is important:

import numpy as np
from lmfit import Parameters, report_fit, Minimizer

x = np.linspace(0, 1, 100)
y1 = 1.0 * np.exp(1.0 * x)

def residual(params, x, data):
    a = params['a'].value
    b = params['b'].value
    model = a * np.exp(b * x)
    return (data - model)

params = Parameters()
params.add('a', value = 2.0)
params.add('b', value = 2.0)

fitter = Minimizer(residual, params, fcn_args=(x, y1))

#fit the original dataset
fitter.minimize()

report_fit(fitter.params)

#let's simulate the data updating from time to time
y1 *= 2

#let's try fitting the data again
fitter.minimize(method='leastsq')
report_fit(fitter.params)

This results in:

[[Variables]]
a:   1          +/- 0        (0.00%) (init= 2)
b:   1          +/- 0        (0.00%) (init= 2)

[[Variables]]
a:   2          +/- 0        (0.00%) (init= 1)
b:   1          +/- 0        (0.00%) (init= 1)

Here the underlying data is changing but we can reuse the fitter instance after the update. This is because the fitter.userargs attribute always points to the same arrays. If we had deepcopied the elements in fcn_args when initialising fitter then we would not be able to reuse fitter after the data had changed. When I originally created this issue I was experiencing problems in a GUI where I was storing all the params instances in a dictionary (P), displaying and editing them. On creating a Minimizer and then calling fitter.minimize several times I was noticing that the relevant params was updating the first time, but not after that. This behaviour is inconsistent. The only way to get around this was to overwrite the entry in P after every fit:

params = P['dataset1']
fitter = Minimizer(residual, params, fcn_args=(x, y1))
fitter.minimize()
P['dataset1'] = fitter.params

The last line should not be necessary. As far as I can tell there are no drawbacks from removing the deepcopy in fitter.unprepare_fit. The fitter.params attribute will then always point to the same params instance used to initialise the fitter object. The only thing one has to do is write a slightly modified minimize function so that the original issue, #56, does not reappear.

I'm willing to write the PR for this.

danielballan commented 9 years ago

I agree that this particular aspect of Minimizer should be changed, and I think I agree that refactoring minimize and unprepare_fit makes the change with effectively no side effects.

A PR with tests will help us all understand the gist of this. Please include tests on minimize and Minimizer, drawing on the examples and those in #56, so we can be sure that most usages are not changed. It would help to run the tests before changing the code, showing which pass and fail before and after.

newville commented 9 years ago

@andyfaff -- I'm still not sure I actually understand the points you're trying to make. They seemed quite confusing to me, and your examples don't really illustrate clear points to me. I'd much prefer to have a real conversation than long monologues that talk past each other, but I'll avoid the temptation of responding to your specific points, especially as I don't really understand them, and have already asked several questions that have gone unanswered. I'm sorry this is as long as it is, and I am beginning to despise github Issues, so I apologize in advance.

I think the basic problem here is that when one does successive fits with the same Minimizer.minimize method or minimize function, one may want to save intermediate results, and have the best-values-so-far be used as starting values for the next fit. But if one wants to keep a copy of the parameters for successive fits, I don't see how it's possible to avoid doing a copy of the params at some point. The only questions are where this happens, and whether the user has to do it explicitly.

Again, I'm not completely sure this is what you think the Issue is (you keep going on about object identity....), but that's the only actionable issue I can see here.

Also: I'm not saying that the current behavior for how params are handled in the Minimizer class is the best behavior. I even gave a proposed solution (copy on input) earlier in this conversation, as was discussed a year ago too (my preference for the mailing list grows with each conversation). At that time copy on output seemed like the least disruptive approach. More options are given below.

I'll skip the majority of your post and jump to the actionable items:

As far as I can tell there are no drawbacks from removing deepcopy in fitter.unprepare_fit.

The user would have to explicitly copy the parameters if they wanted to keep intermediate results for successive fits. That is a drawback.
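That is, with the internal copy removed, keeping an intermediate result would require something like this (a sketch reusing the fitter example above):

from copy import deepcopy

fitter.minimize(method='nelder-mead')
first_fit = deepcopy(fitter.params)  # explicit snapshot, or it is overwritten
fitter.minimize(method='leastsq')
# first_fit holds the intermediate result; fitter.params holds the final one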

The fitter.params attribute will then always point to the same params instance used to initialize the fitter object.

I just don't understand why the identity of fitter.params and params is important. The values of those objects are definitely important. But why do you need two names bound to the same object? If anything, that seems like a (probably benign) problem to me. In any event, the identity of an object's attributes should be considered an implementation detail.

The only thing one has to do is write a slightly modified minimize function so the original issue, #56, does not reappear.

I think the minimize method and function should have very similar behaviors, and the function should be a trivial wrapping of the method. Doing otherwise opens up room for incompatibilities and headaches. I see a few possible approaches to improving what the Minimizer.minimize method and the minimize function do with params:

  1. Leave as is. Users would have to make copies of params between subsequent uses of the Minimizer.minimize method. This requires no code changes, but clarification of docs and examples.
  2. Have copies made only in the minimize function. Copies would still have to be made explicitly by the user when using the minimize method.
  3. Copy params on input (in __init__() or prepare_fit()). This would leave the original parameter values unchanged. A user would still have to make explicit copies to retain intermediate fits. This would potentially break some existing scripts.
  4. Have the Minimizer.minimize method return an object (MinimizeResult, akin to OptimizeResult or to ModelFit) that would contain status information and a copy of self.params after the fit. This class could also have a fit_report method, similar to ModelFit (for uniformity, perhaps ModelFit should be renamed ModelResult? I think there's little harm in that). In this option, Minimizer.params would not need to be re-copied internally (which, I think, partly addresses your concern), and each successive fit would have the last best values as starting values. Minimizer.__init__() could either have the passed-in params be copied on input or be identical to self.params. Since we'd be breaking stuff anyway, I would suggest copy-on-input as the better choice, but could go either way. In this scheme, the minimize function would return the same output, leaving it a trivial wrapping of Minimizer.minimize(). This would break some existing scripts, but would also allow for more flexibility in what is included in the return value (see the sketch below).
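As a hypothetical sketch of what option 4 could look like from the user's side (the class and attribute names here are illustrative, modelled on scipy's OptimizeResult, not an existing lmfit API):

result = fitter.minimize(method='leastsq')  # would return a MinimizeResult
print(result.success)                       # status information about the fit
print(result.params['a'].value)             # a copy of the best-fit parameters
print(result.fit_report())                  # uniform with ModelFit's fit report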

Option 1 is the least work, but clearly confusing. I think Option 2 is confusing too. Option 3 is probably better than 1 or 2, but I don't know that it really solves the core issue, and may be nearly as confusing as the current behavior. My preference would be for Option 4: it is the most work, but probably the best long-term solution. If we're going to break things, we might as well make it worth the effort.

I would be eager to hear opinions on this, and especially whether you think option 4 would satisfy the "Issue" as you see it.

Again, sorry this post was so long.