tBuLi / symfit

Symbolic Fitting; fitting as it should be.
http://symfit.readthedocs.org
MIT License
232 stars 17 forks source link

MINPACK minimizer with multiple datasets #292

Open Jhsmit opened 4 years ago

Jhsmit commented 4 years ago

Running the fitting with multiple datasets doesnt work with the MINPACK minimizer. The example from the docs:


from symfit import variables, parameters, exp, Model, Fit
from symfit.core.minimizers import MINPACK
import numpy as np

x_1, x_2, y_1, y_2 = variables('x_1, x_2, y_1, y_2')
y0, a_1, a_2, b_1, b_2 = parameters('y0, a_1, a_2, b_1, b_2')

model = Model({
    y_1: y0 + a_1 * exp(- b_1 * x_1),
    y_2: y0 + a_2 * exp(- b_2 * x_2),
})

xdata1 = np.linspace(0, 10, num=10, endpoint=True)
xdata2 = np.linspace(0, 10, num=20, endpoint=True)

ydata1 = 20 + 80*np.exp(-5*xdata1)
ydata1 += 2*np.random.rand(len(ydata1))
ydata2 = 20 + 70*np.exp(-5*xdata2)
ydata2 += 5*np.random.rand(len(ydata2))

fit = Fit(model, x_1=xdata1, x_2=xdata2, y_1=ydata1, y_2=ydata2, minimizer=MINPACK)
fit_result = fit.execute()

print(fit_result)

Gives:

ValueError: operands could not be broadcast together with shapes (10,) (20,)

Other minimizers do work and reducing the datasets to one also does work.

Jhsmit commented 4 years ago

I have a note somewhere that claims that MINPACK also has problems with VectorLeastSquares. Don't have an example atm.

tBuLi commented 4 years ago

Yeah I think that is a fundamental problem with the design of MINPACK (or the scipy wrapping of it, don't know at what level this occurs.)

Basically, the scipy API wants a function which returns the residual per data point: (y_i - f(x_i, *params))/sigma_i. It will then take the quadratic sum itself.

In order to do global fitting we compute the chi2 per component of the model, and minimize their sum: chi2 = \sum_{n=1}^{components} chi2_n

I guess we could redefine VectorLeastSquares such that it gives the residuals of each component appended to each other into a one dimensional list, since that should be equivalent. But if it also uses these residuals in the calculation of the jacobian, which it probably does, then we have to be careful to check that they are also equivalent on that level.