tBuLi / symfit

Symbolic Fitting; fitting as it should be.
http://symfit.readthedocs.org
MIT License
233 stars 17 forks source link

Found breaking cornercases #223

Open pckroon opened 5 years ago

pckroon commented 5 years ago

I'm in the progress of setting up some testing using random models, and I'll use this issue to document some of the breaking edge cases I'm coming across.

model = [Y1(x; ) = x + 5]
{'sigma_Y1': None, 'Y1': array([5.]), 'x': array([0.])}

test_hypothesis.py:35: in test_simple
    result = fit.execute()
../symfit/core/fit.py:1430: in execute
    minimizer_ans = self.minimizer.execute(**minimize_options)
../symfit/core/support.py:424: in wrapped_func
    return func(*bound_args.args, **bound_args.kwargs)
../symfit/core/minimizers.py:417: in execute
    return super(ScipyGradientMinimize, self).execute(jacobian=jacobian, **minimize_options)
../symfit/core/support.py:424: in wrapped_func
    return func(*bound_args.args, **bound_args.kwargs)
../symfit/core/minimizers.py:351: in execute
    **minimize_options
../../../../../.virtualenvs/symfit/lib/python3.6/site-packages/scipy/optimize/_minimize.py:597: in minimize
    return _minimize_bfgs(fun, x0, args, jac, callback, **options)
../../../../../.virtualenvs/symfit/lib/python3.6/site-packages/scipy/optimize/optimize.py:963: in _minimize_bfgs
    gfk = myfprime(x0)
../../../../../.virtualenvs/symfit/lib/python3.6/site-packages/scipy/optimize/optimize.py:293: in function_wrapper
    return function(*(wrapper_args + args))
../symfit/core/minimizers.py:156: in resized
    out = func(*args, **kwargs)
../symfit/core/objectives.py:285: in eval_jacobian
    ordered_parameters, **parameters
../symfit/core/objectives.py:145: in eval_jacobian
    result = self.model.eval_jacobian(**key2str(parameters))._asdict()
../symfit/core/fit.py:827: in eval_jacobian
    jac[idx] = np.stack(np.broadcast_arrays(*comp))
E           ValueError: need at least one array to stack
pckroon commented 5 years ago
[Y1(; a) = a + 5,
 Y2(; a) = (-4)**a]
{'sigma_Y1': None, 'sigma_Y2': None, 'Y1': array([5.]), 'Y2': array([1.])}

test_hypothesis.py:35: in test_simple
    result = fit.execute()
../symfit/core/fit.py:1430: in execute
    minimizer_ans = self.minimizer.execute(**minimize_options)
../symfit/core/support.py:424: in wrapped_func
    return func(*bound_args.args, **bound_args.kwargs)
../symfit/core/minimizers.py:417: in execute
    return super(ScipyGradientMinimize, self).execute(jacobian=jacobian, **minimize_options)
../symfit/core/support.py:424: in wrapped_func
    return func(*bound_args.args, **bound_args.kwargs)
../symfit/core/minimizers.py:351: in execute
    **minimize_options
../../../../../.virtualenvs/symfit/lib/python3.6/site-packages/scipy/optimize/_minimize.py:597: in minimize
    return _minimize_bfgs(fun, x0, args, jac, callback, **options)
../../../../../.virtualenvs/symfit/lib/python3.6/site-packages/scipy/optimize/optimize.py:963: in _minimize_bfgs
    gfk = myfprime(x0)
../../../../../.virtualenvs/symfit/lib/python3.6/site-packages/scipy/optimize/optimize.py:293: in function_wrapper
    return function(*(wrapper_args + args))
../symfit/core/minimizers.py:156: in resized
    out = func(*args, **kwargs)
TypeError: Cannot cast ufunc subtract output from dtype('complex128') to dtype('float64') with casting rule 'same_kind'

This one is caused by BFGS calling the model/objective with a non-integer value for 'a', causing Y2 to become complex. I'm not sure what to do about this though. We can guard for this in the objective code, but that raises the question of how to interpret complex objective values (minimize the magnitude?).

tBuLi commented 5 years ago

Excellent work, thanks for tackling this.

In response to the second issue: LeastSquares could be made compatible with complex numbers relatively straightforwardly, although it is is slightly more complicated than just minimizing the complex magnitude (The jacobian/hessian change more significantly, as does the case for vector models). However, the next question becomes if complex numbers should be supported by default, or should remain a seperate case. I think the latter, because most use-cases are real so we would negatively impact performance for most use-cases.

One solution that comes to mind is to use sympy assumptions on the variables to hint to symfit that a switch to ComplexLeastSquares is needed:

Y2 = Variable('Y2', complex=True)

From this we could have fit use ComplexLeastSquares instead of LeastSquares. Perhaps it is also possible to use sympy to inspect expressions to see if there is a risk of them turning imaginary so we can raise a more meaningful warning.

pckroon commented 5 years ago

From this we could have fit use ComplexLeastSquares instead of LeastSquares. Perhaps it is also possible to use sympy to inspect expressions to see if there is a risk of them turning imaginary so we can raise a more meaningful warning.

This, together with sympy assumptions opens up a very nasty can of optimisation worms because someone will pass complex=False or real=True. Or integer=True. And note that at this point all three are equivalent. It would require Sympy to be able to propagate these assumptions through the expressions, and translate them to 1) bounds and 2) constraints. This makes #81 relevant again, and that's a very nasty can of optimisation worms; in particular integer constraints. This also means we should pass our parameter bounds as assumptions to Sympy. I think this ComplexLeastSquares discussion warrants its own issue. I'll leave it to you to make one.

I'll try to find a workaround for my tests.