lmfit / lmfit-py

Non-Linear Least Squares Minimization, with flexible Parameter settings, based on scipy.optimize, and with many additional classes and methods for curve fitting.
https://lmfit.github.io/lmfit-py/
Other
1.04k stars 271 forks source link

scalar minimizer not preserving all best-fit values #655

Closed newville closed 4 years ago

newville commented 4 years ago

Description

Using scalar minimizers and numdifftools and parameters with bounds, the reported value for the last parameter in the list can be (slightly) from the actual best-fit value.

I believe this is due to Miniminze._calculate_covariance_matrix and specifically numdifftools.Hessian allowing the values to change and not fully restoring the values. I'm not sure why this affects only the final variable, but it appears to be that way.

A Minimal, Complete, and Verifiable example
import numpy as np
from lmfit import Model

best_chisqr = 1e99
best_vals = {'intercept': 0, 'slope': 0}

def linear(x, intercept=0., slope=1.):
    global best_chisqr, best_vals
    prediction = x*slope + intercept
    chisqr = ((y-prediction)**2).sum()
    if chisqr < best_chisqr:
        best_chisqr = chisqr
        best_vals['intercept'] = intercept
        best_vals['slope'] = slope
    return prediction

model = Model(linear)
params = model.make_params()
params['intercept'].set(value=-1, min=-20, max=0)
params['slope'].set(value=1,      min=-100, max=400)

x = np.linspace(0, 9, 10)
np.random.seed(12)
y = x* 1.34 - 4.5 + np.random.normal(loc=0.0, scale=0.25, size=x.size)

result = model.fit(y, x=x, method='nelder', params=params)
print(result.fit_report())

print(" Value       best          reported ")
print(" =====================================")
print(" chisqr     {: .9f}  {: .9f}".format(best_chisqr, result.chisqr))
print(" intercept  {: .9f}  {: .9f}".format(best_vals['intercept'], result.params['intercept'].value))
print(" slope      {: .9f}  {: .9f}".format(best_vals['slope'], result.params['slope'].value))
print(" =====================================")
Error message:

The output of the above script is

[[Model]]
    Model(linear)
[[Fit Statistics]]
    # fitting method   = Nelder-Mead
    # function evals   = 76
    # data points      = 10
    # variables        = 2
    chi-square         = 0.87622740
    reduced chi-square = 0.10952843
    Akaike info crit   = -20.3471472
    Bayesian info crit = -19.7419770
[[Variables]]
    intercept: -4.66080113 +/- 0.19452149 (4.17%) (init = -1)
    slope:      1.33279713 +/- 0.03643700 (2.73%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
    C(intercept, slope) = -0.843
 Value       best          reported 
 =====================================
 chisqr      0.876227403   0.876227403
 intercept  -4.660801131  -4.660801131
 slope       1.372998638   1.332797126
 =====================================

Note that although the best chi-square is correctly reported, the value for slope is off slightly.

FWIW, this won't happen if the bounds are removed or if numdifftools is not installed.

Version information

Python: 3.7.7 (default, Mar 23 2020, 22:36:06) [GCC 7.3.0]

lmfit: 1.0.1, scipy: 1.4.1, numpy: 1.18.1, asteval: 0.9.18, uncertainties: 3.1.2

And: numdifftools version 0.9.39.

Link(s)

This began as a discussion on the mailing list: https://groups.google.com/d/msg/lmfit-py/uof1FDWOD7M/d7muQSQyAQAJ

reneeotten commented 4 years ago

fixed in 05bbc07321480d02dcd13c122ba16d42ec3d4eae