tBuLi / symfit

Symbolic Fitting; fitting as it should be.
http://symfit.readthedocs.org
MIT License
235 stars 19 forks source link

shape mismatch for sum #161

Closed jbarnoud closed 6 years ago

jbarnoud commented 6 years ago

I try to run the following snippet but get an exception:

import symfit

temperature = symfit.Variable('temperature')
length = symfit.Variable('length')
shift = symfit.Parameter('s', value=300)
model = {length: shift + 1 / (1 + symfit.exp(- temperature))}

temp_data = [270, 280, 285, 290, 295, 300, 310, 320]
length_data = [8.33, 8.41, 8.45, 8.5, 8.54, 9.13, 9.27, 9.4]
fit = symfit.Fit(model, length=length_data, temperature=temp_data)
fir_result = fit.execute()

The exception:

ValueError                                Traceback (most recent call last)
<ipython-input-4-47e2b6349ab2> in <module>()
      9 length_data = [8.33, 8.41, 8.45, 8.5, 8.54, 9.13, 9.27, 9.4]
     10 fit = symfit.Fit(model, length=length_data, temperature=temp_data)
---> 11 fir_result = fit.execute()

~/dev/symfit/symfit/core/fit.py in execute(self, **minimize_options)
   1389         else:
   1390             if cov_matrix is None:
-> 1391                 cov_matrix = self.covariance_matrix(dict(zip(self.model.params, minimizer_ans._popt)))
   1392         finally:
   1393             minimizer_ans.covariance_matrix = cov_matrix

~/dev/symfit/symfit/core/fit.py in covariance_matrix(self, best_fit_params)
    873             if len(set(arr.shape for arr in self.sigma_data.values())) == 1:
    874                 # Shapes of all sigma data identical
--> 875                 return self._cov_mat_equal_lenghts(best_fit_params=best_fit_params)
    876             else:
    877                 return self._cov_mat_unequal_lenghts(best_fit_params=best_fit_params)

~/dev/symfit/symfit/core/fit.py in _cov_mat_equal_lenghts(self, best_fit_params)
    951             jac = jac * np.ones_like(W)
    952         # Dot away all but the parameter dimension!
--> 953         cov_matrix_inv = np.tensordot(W*jac, jac, (range(1, jac.ndim), range(1, jac.ndim)))
    954         cov_matrix = np.linalg.inv(cov_matrix_inv)
    955         return cov_matrix

/coarse/jon/Envs/apricot/lib/python3.6/site-packages/numpy/core/numeric.py in tensordot(a, b, axes)
   1281                 axes_b[k] += ndb
   1282     if not equal:
-> 1283         raise ValueError("shape-mismatch for sum")
   1284 
   1285     # Move the axes to sum over to the end of "a"

ValueError: shape-mismatch for sum

According to pip show I am using version 0.3.3.dev487 of symfit, version 1.1.1 of sympy, and 1.14.3 of numpy.

I try to simplify the model to {length: shift - temperature} and I get the same error.

jbarnoud commented 6 years ago

I updated to numpy-1.14.5 and symfit-0.4.1 and the issue remains. The snippet has to be adapted to explicitly use the name keyword argument for variables and parameters.

tBuLi commented 6 years ago

Interesting. Could you try to use the master branch instead of the latest release? I remember something related to this came up recently and was patched but I'm not sure it's in the latest release yet.

Some work arounds might be to call Fit with sigma_length=None or sigma_length=np.ones_like(length_data) to see if this solves your problem.

Let me know if this problem persist in the current master branch.

jbarnoud commented 6 years ago

I just tried the latest version on master (1344b80e7cf8d09ec0b2c777b17982ca9eaa98de) and got the same result. I also tried the two suggested workarounds and got the same exception.

tBuLi commented 6 years ago

Btw, I don't understand your remark "The snippet has to be adapted to explicitly use the name keyword argument for variables and parameters." Does that mean that if you do that, the problem is gone?

I have not had time yet to run your snippet, but hopefully I will have some next week.

pckroon commented 6 years ago

I've seen the code and issue in action, and if you specify variable and parameter names it still breaks.

tBuLi commented 6 years ago

Ok, I've found the origin of the problem. Because this model has a single additive Parameter, symfit is not able to determine the shape of the data on the level of the Model. (Because Model's don't have knowledge about the data.)

So in Model.elav_jacobian there is some logic which makes sure all the components of the Jacobian have the same shape, but in this case the jacobian is only the derivative w.r.t shift, which is 1. Because there is no other component, the jacobian returned is therefore [[[1]]] (lists correspond to param, component, datapoints).

But that is not correct, it should be [[ 8 [1]]] in your example in order for it to be contracted with the weight matrix W in the line `cov_matrix_inv = np.tensordot(Wjac, jac, (range(1, jac.ndim), range(1, jac.ndim)))`

Now the question is, what to do about it in light of the plans for 1.0.0, which will upgrade Variables to indexed objects, which will make it easier to prevent these issues in the first place because Fit can then work with a deepcopy of the Model whose Variable shapes have been set on the bases of the data. This will make such problems a thing of the past.

So I'm now unsure wether to add a quick try/except for this specific type of problem, or to set this to wontfix because it will be fixed in the future in a far more elegant way.

As a workaround for your immediate problems btw, I can now suggest using

k = symfit.Parameter('k', value=1, fixed=True)
model = {length: shift + 1 / (1 + symfit.exp(- k* temperature))}

instead. This way the shape can be correctly estimated.

pckroon commented 6 years ago

Now the question is, what to do about it in light of the plans for 1.0.0, which will upgrade Variables to indexed objects, which will make it easier to prevent these issues in the first place because Fit can then work with a deepcopy of the Model whose Variable shapes have been set on the bases of the data. This will make such problems a thing of the past.

Sounds like a good solution, but it really depends on when you intend to release this. I it's more than a month away I think a quickfix is a good idea.

tBuLi commented 6 years ago

You are right. I could say soon but since no project in human history is ever delivered according to schedule, a quick fix is probably in order ;).