Error with constrained fit

egpbos commented 5 years ago

I'm trying to fit a point distribution to a cone shape. I use ectopylasm for this (can be pip installed as described in the README, it's only used in this issue for generating the point distribution). When I try to fit as follows it works, but the fit is often not great, the main problems being the height and the fact that often the x-axis rotation is $\pi$ away from what I'd like:

import numpy as np
import ectopylasm as ep

import symfit as sf

cone = ep.geometry.Cone(0.5, 0.5)
n_steps = 20
xyz = np.array(ep.geometry.cone_surface(cone, n_steps=n_steps))
xyz = xyz.reshape(3, n_steps*n_steps)
xyz += np.random.normal(0, 0.03, xyz.shape)

h, radius, rot_x, rot_y, bx, by, bz = sf.parameters('h, radius, rot_x, rot_y, bx, by, bz')
x, y, z, f = sf.variables('x, y, z, f')

x_min_b = sf.Matrix([x - bx, y - by, z - bz])  # column matrix

M_x = sf.Matrix([[1, 0, 0],
                 [0, sf.cos(-rot_x), sf.sin(-rot_x)],
                 [0, -sf.sin(-rot_x), sf.cos(-rot_x)]])

M_y = sf.Matrix([[sf.cos(-rot_y), 0, sf.sin(-rot_y)],
                 [0, 1, 0],
                 [-sf.sin(-rot_y), 0, sf.cos(-rot_y)]])

u = M_x @ (M_y @ x_min_b)

cone_model = {
    f: h**2 * (u[0]**2 + u[1]**2) / radius**2 - (u[2] - h)**2
}

cone_fit = sf.Fit(cone_model, x=xyz[0], y=xyz[1], z=xyz[2], f=np.zeros_like(xyz[0]))

cone_fit_result = cone_fit.execute()

print(cone_fit_result)

This will print something like

Parameter Value        Standard Deviation
bx        1.389905e+00 nan
by        -8.714229e-01 nan
bz        1.121955e+00 nan
h         9.955837e-01 nan
radius    1.034978e+01 nan
rot_x     3.086634e+00 7.604867e-02
rot_y     5.656797e-02 4.065809e-02
Status message         Optimization terminated successfully.
Number of iterations   243
Objective              <symfit.core.objectives.LeastSquares object at 0x12052d3c8>
Minimizer              <symfit.core.minimizers.BFGS object at 0x1205aa2e8>

Goodness of fit qualifiers:
chi_squared            0.18392761747750358
objective_value        0.09196380873875179
r_squared              -inf

This is all fine (not ideal, but not the problem I'm focusing on here). However, when I now try to improve my fit by adding the constraint that $|z| \le h$, or actually $u_2 \le h$, i.e. the non-rotated z-axis of the cone should not be higher than the height of the cone (which will hopefully simultaneously constrain the x-rotation and the height), in the following way:

cone_constraints = [
    sf.LessThan(sf.Abs(u[2]), h)
]

cone_fit = sf.Fit(cone_model, x=xyz[0], y=xyz[1], z=xyz[2], f=np.zeros_like(xyz[0]),
                  constraints=cone_constraints)

cone_fit_result = cone_fit.execute()

I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-e2514d4dda69> in <module>
      6                   constraints=cone_constraints)
      7 
----> 8 cone_fit_result = cone_fit.execute()

.../symfit/core/fit.py in execute(self, **minimize_options)
   1461         :return: FitResults instance
   1462         """
-> 1463         minimizer_ans = self.minimizer.execute(**minimize_options)
   1464         minimizer_ans.covariance_matrix = self.covariance_matrix(
   1465             dict(zip(self.model.params, minimizer_ans._popt))

.../symfit/core/support.py in wrapped_func(*args, **kwargs)
    422                     else:
    423                         bound_args.arguments[param.name] = param.default
--> 424             return func(*bound_args.args, **bound_args.kwargs)
    425         return wrapped_func
    426 

.../symfit/core/minimizers.py in execute(self, **minimize_options)
    408         if jacobian is None:
    409             jacobian = self.wrapped_jacobian
--> 410         return super(ScipyGradientMinimize, self).execute(jacobian=jacobian, **minimize_options)
    411 
    412     def scipy_constraints(self, constraints):

.../symfit/core/minimizers.py in execute(self, **minimize_options)
    466 
    467     def execute(self, **minimize_options):
--> 468         return super(ScipyConstrainedMinimize, self).execute(constraints=self.wrapped_constraints, **minimize_options)
    469 
    470     def scipy_constraints(self, constraints):

.../symfit/core/minimizers.py in execute(self, **minimize_options)
    428     def execute(self, **minimize_options):
    429         return super(ScipyBoundedMinimizer, self).execute(bounds=self.bounds,
--> 430                                                           **minimize_options)
    431 
    432 

.../symfit/core/support.py in wrapped_func(*args, **kwargs)
    422                     else:
    423                         bound_args.arguments[param.name] = param.default
--> 424             return func(*bound_args.args, **bound_args.kwargs)
    425         return wrapped_func
    426 

.../symfit/core/minimizers.py in execute(self, bounds, jacobian, hessian, constraints, **minimize_options)
    353             jac=jacobian,
    354             hess=hessian,
--> 355             **minimize_options
    356         )
    357         return self._pack_output(ans)

.../scipy/optimize/_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
    606     elif meth == 'slsqp':
    607         return _minimize_slsqp(fun, x0, args, jac, bounds,
--> 608                                constraints, callback=callback, **options)
    609     elif meth == 'trust-constr':
    610         return _minimize_trustregion_constr(fun, x0, args, jac, hess, hessp,

.../scipy/optimize/slsqp.py in _minimize_slsqp(func, x0, args, jac, bounds, constraints, maxiter, ftol, iprint, disp, eps, callback, **unknown_options)
    440                 a = zeros((la, n))
    441             else:
--> 442                 a = vstack((a_eq, a_ieq))
    443             a = concatenate((a, zeros([la, 1])), 1)
    444 

.../numpy/core/shape_base.py in vstack(tup)
    281     """
    282     _warn_for_nonsequence(tup)
--> 283     return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
    284 
    285 

ValueError: all the input array dimensions except for the concatenation axis must match exactly

Any idea what's going wrong here?

tBuLi commented 5 years ago

I think the problem is that u[2] is still an array with some shape, not a scalar. The constraint functions have to return a scalar. If I understand you correctly you want u[2] < h for every point in u[2]? If so, you have still got some work ahead of you ;). The naive solution would be to create a separate constraint at each point, but that will probably kill your performance.

To do more advanced stuff with a single constraint model, have a look at the alternative constructor as_constraint. With this you can turn any model into a constraint, including a CallableNumericalModel which gives you a lot more freedom.

pckroon commented 5 years ago

As a second point, as I understand it rot_{x, y, z} should be between -1 and 1 (since they're rotation angles)? If so, you should add the boundaries to their parameters (rot_x.min, rot_x.max = -1, 1), that should make the fitting a bit easier as well.

pckroon commented 5 years ago

Lastly, if you have some computational power to spare, you can try to use the DfiferentialEvolution minimizer, which should find the global minimum.

egpbos commented 5 years ago

@tBuLi:

u[2] is a scalar, u itself is a vector: Schermafbeelding 2019-07-22 om 13 52 59

To give a bit more context of what I'm doing: you can define an implicit function for a cone, which is the one I'm defining as cone_model here. I don't think this completely characterizes a single cone (i.e. it can still give two directions and it doesn't really constrain height), which is why I'm adding the extra constraint. See Mathworld for some more context.

However, this is a cone with it's base (the circle) in the x-y plane and the central axis along the z-axis. I want to describe a general cone anywhere in space and rotated arbitrarily, so I add three coordinates for the location and two rotation angles for the orientation.

@pckroon:

Theoretically, the angles are boundless, since 0 equals 2\pi equals -2\pi equals (+/-)2^n\pi for any integer n. In practice, I guess it may help the solver if I would limit it to be just between 0 and 2\pi. Is this what you mean?

egpbos commented 5 years ago

Ok, I should maybe be more precise.

When I say "base in the x-y plane" I actually mean in the "u[0]-u[1] plane" and "z-axis" would correspond to "u[2] axis". The simple implicit cone equation holds in this u-vector coordinate system, but the coordinates of the points I'm trying to fit to the cone are in x, y, z coordinates. The mapping is done in the code by two rotation matrices and a simple translation (subtraction) of the desired base position.

egpbos commented 5 years ago

I'm doing some debugging by inserting prints in scipy and indeed somehow something seems to think that there is a 400-dimensional thing going on (which undoubtedly corresponds to the 20x20(x3) sized xyz array)...

pckroon commented 5 years ago

Theoretically, the angles are boundless, since 0 equals 2\pi equals -2\pi equals (+/-)2^n\pi for any integer n. In practice, I guess it may help the solver if I would limit it to be just between 0 and 2\pi. Is this what you mean?

Yes. Depending on things like the phase of the moon numerical minimizers can really suffer for degenerate answers. And bounds are easier than constraints.

egpbos commented 5 years ago

Yes, it's a good idea, I'm using bounds now, thanks @pckroon

However, my problem still stands. The model doesn't fully constrain the cone, so I need an extra constraint, which still gives the same exception.

tBuLi commented 5 years ago

@egpbos , maybe I should've been more clear, indeed in your model u[2] is a scalar, but numerically it is not since you call fit with

cone_fit = sf.Fit(cone_model, x=xyz[0], y=xyz[1], z=xyz[2], f=np.zeros_like(xyz[0]))

so x, y, z are some kind of arrays. And so the output of u[2] will be something of the same shape as x. thanks to the wonder of ducktyping ;).

pckroon commented 5 years ago

I see a few ways forward: 1) Make a constraint from a CallableNumericalModel, as @tBuLi suggested 2) Change the constraint to something like sf.Max(u[2]**2) < h**2, or reformulate it in some other way to reduce it to a single value (norm? average? The problem with Max is that it is not differentiable). 3) Get a good initial guess for h and u, for example based on the largest distance found between any two points in your data set, and hope for the best. 4) Run the fit twice: first unconstrained to get an initial guessed value for u and h, and use those to construct a constraint. Run the fit again to find the final answer. Note that this option feels very unsatisfactory.

egpbos commented 5 years ago

Could you help me understand how CallableNumericalModel can be used in this case?

Options 2-4 will not do. The problem, as I think I'm beginning to understand better now, is really that without the constraint, the model is underconstrained. The cone_model equation by itself actually describes a double cone with infinite height, like this: http://mathworld.wolfram.com/DoubleCone.html

Without the height constraint, and in case the points are rather noisy, the cone is likely to find "better" fits than the actual generating single cone, because the upper half of the double cone will fit better to the points at the top of the single cone.

How can I use CallableNumericalModel to make a constraint on u[2] (interpreted as a variable)?

egpbos commented 5 years ago

Ah, now I understand your option 2. I tried it, but it gives the same Exception, unexpectedly...

tBuLi commented 5 years ago

Using CallableNumericalModel you could write any pythonic function you might need. For example:

def constraint(u_2, h):
    """ Do your magic here """
    return np.sum(u_2 < h) - u_2.size

c, u_2 = variables('c, u_2')
constraint = CallableNumericalModel.as_constraint(
    {c: constraint, u_2: u[2]}, 
    connectivity_mapping={c: {u_2, h}},
    constraint_type=LessThan,
    model=model
)

I didn't think about if this is the right constraint at all but hopefully it will demonstrate how you can use this. (The constraint function is then assumed to be of the type lhs - rhs < 0.)

tBuLi / symfit

Error with constrained fit #263