ENH: more general way for fit constrained (GenericLikelihoodModel) - another round

GenericLikelihoodModel added the pattern for fixing params a long time ago, based on parameter transformation or expansion/reduction.

- The only current use case of the GenericLikelihoodModel version, AFAIK, is in miscmodel.TModel, and only for fixing some params.
- statespace MLEModel uses a similar pattern for fit_constrained. AFAICS, also only for fixing some params.
- Some count models like NB, NBP and GPP use the _transparams attribute to switch between linear and exp ilink for the extra dispersion parameters.
- (Without checking current status) stationarity in tsa models like ARIMA was imposed by a predefined parameter transformation.
- SVAR uses a mask for identifying restrictions; all are fixed-params type constraints.

My generic fit_constrained, used in GLM, also transforms exog in addition to params and uses offset.

Issue here:

- make constraints based on parameter transformation into a usable, general pattern
- extend it to more general cases than fix_params

fix_params only uses a mask for parameters fixed at user given values.

Can we extend this to:

- linear restrictions, similar to the generic fit_constrained. Which constraints can we handle this way?
  - note: a related issue was in inference for hypothesis tests for linear constraints (somewhere I used the transformation restriction)
  - can we get a linear transformation matrix, p_constrained = H dot p_unconstrained, and the inverse function? (linalg tools)
- affine restrictions: do we need an offset or not?
- nonlinear restrictions: p_constrained = H(p_unconstrained). If available, this would follow the same pattern with transform_params.
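A minimal sketch of the linear and affine cases, assuming the restrictions are written as R p = q. The helper name `affine_parameterization` and the symbols `R`, `q`, `p0` are illustrative and not existing statsmodels API. It builds p_constrained = H u + p0 from a null-space basis of R, and shows that an orthonormal H gives a simple inverse map.

```python
import numpy as np


def affine_parameterization(R, q, tol=1e-12):
    """Return (H, p0) such that R @ (H @ u + p0) == q for every u.

    Hypothetical helper: H spans the null space of R, p0 is the
    minimum-norm particular solution of R p = q.
    """
    R = np.atleast_2d(np.asarray(R, dtype=float))
    q = np.atleast_1d(np.asarray(q, dtype=float))
    p0 = np.linalg.pinv(R) @ q
    # Rows of vt beyond the rank of R form an orthonormal null-space basis.
    _, s, vt = np.linalg.svd(R)
    rank = int(np.sum(s > tol * s.max()))
    H = vt[rank:].T
    return H, p0


# Restrictions: p1 = p2 and p3 = 2, for a three-parameter model.
R = np.array([[1.0, -1.0, 0.0],
              [0.0, 0.0, 1.0]])
q = np.array([0.0, 2.0])

H, p0 = affine_parameterization(R, q)

u = np.array([0.7])                 # reduced (unconstrained) params
params_full = H @ u + p0            # constrained full params
u_back = H.T @ (params_full - p0)   # inverse map, valid because H'H = I
```

With homogeneous restrictions (q = 0) the particular solution p0 is zero, so the offset is only needed in the affine case.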

fix params: constraint on individual parameters, e.g. zero constraint, p1 = value = 0

- exog transform: if value = 0, then drop that exog; if value is not zero, then use x1 * value as offset (requires offset)
- parameter transformation: simple mask to add/drop the fixed param with its specific value, as in the current pattern
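The two fix-params variants can be sketched side by side. This is an illustration only: `expand_params` and `reduce_exog` are hypothetical helper names, and the check is on the linear predictor, which both approaches should reproduce identically.

```python
import numpy as np


def expand_params(params_free, fixed_mask, fixed_values):
    """Parameter transformation: insert fixed values into the full vector."""
    params_full = np.empty(len(fixed_mask), dtype=float)
    params_full[fixed_mask] = fixed_values
    params_full[~fixed_mask] = params_free
    return params_full


def reduce_exog(exog, fixed_mask, fixed_values):
    """Exog transformation: drop fixed columns and move them into an offset."""
    exog_reduced = exog[:, ~fixed_mask]
    offset = exog[:, fixed_mask] @ np.asarray(fixed_values, dtype=float)
    return exog_reduced, offset


rng = np.random.default_rng(0)
exog = rng.normal(size=(5, 3))
fixed_mask = np.array([False, True, False])   # second param is fixed
fixed_values = np.array([0.5])
params_free = np.array([1.0, -2.0])

# Variant 1: full model evaluated at expanded params.
params_full = expand_params(params_free, fixed_mask, fixed_values)
linpred_full = exog @ params_full

# Variant 2: reduced model plus offset (offset is zero if the value is 0).
exog_reduced, offset = reduce_exog(exog, fixed_mask, fixed_values)
linpred_reduced = exog_reduced @ params_free + offset
```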

simple case: impose that two parameters are the same, p1 = p2 = p

- exog transformation: use x1 + x2 as regressor, optimize the reduced model
- parameter transformation: optimize w.r.t. the reduced params p and compute loglike based on p1, p2 (maybe I have some code for this in the orthogonal complement linalg tools). The transformation will in general not be unique.
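A small sketch of the p1 = p2 case, using least squares as a stand-in for the likelihood optimization (an assumption made to keep the example short). The matrix `H` and the alternative basis `H_alt` are illustrative: they span the same column space, so the reduced params differ but the constrained full params agree, which is the non-uniqueness mentioned above.

```python
import numpy as np

rng = np.random.default_rng(12345)
nobs = 50
exog = rng.normal(size=(nobs, 3))
endog = exog @ np.array([1.0, 1.0, -0.5]) + 0.1 * rng.normal(size=nobs)

# params_full = H @ params_reduced imposes p1 = p2.
H = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Exog transformation: first reduced column is x1 + x2.
exog_reduced = exog @ H
params_reduced, *_ = np.linalg.lstsq(exog_reduced, endog, rcond=None)
params_full = H @ params_reduced

# Alternative basis of the same column space (orthonormal columns).
H_alt, _ = np.linalg.qr(H)
params_reduced_alt, *_ = np.linalg.lstsq(exog @ H_alt, endog, rcond=None)
params_full_alt = H_alt @ params_reduced_alt
```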

nonlinear case: e.g. p1 / p2 = value, or an implicit function g(p1, p2) = value

- parameter transformation: maximize w.r.t. p2, and in loglike set p1 = p2 * value
- in the general case, we need to turn the implicit function into a function that maps the constrained to the unconstrained parameter space
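As a sketch, the ratio constraint can be written as an expansion function from reduced to full params. The second function uses a hypothetical implicit constraint, p1 * p2 = value, chosen only to show a case where the solved-out mapping is genuinely nonlinear. All function names are illustrative, and the Jacobian is checked by finite differences rather than taken from any library.

```python
import numpy as np


def expand_ratio(params_reduced, value):
    """p1 / p2 = value: reduced params are (p2, p3, ...), p1 = value * p2."""
    p2 = params_reduced[0]
    return np.concatenate(([value * p2], params_reduced))


def expand_product(params_reduced, value):
    """Hypothetical constraint p1 * p2 = value, solved for p1 = value / p2."""
    p2 = params_reduced[0]
    return np.concatenate(([value / p2], params_reduced))


def jacobian_fd(func, x, eps=1e-6):
    """Central finite-difference Jacobian of func at x."""
    x = np.asarray(x, dtype=float)
    jac = np.empty((func(x).size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        jac[:, j] = (func(x + step) - func(x - step)) / (2 * eps)
    return jac


value = 3.0
params_reduced = np.array([2.0, -1.0])

full_ratio = expand_ratio(params_reduced, value)
full_product = expand_product(params_reduced, value)

# Analytical Jacobian of the product case: d(value / p2) / d p2 = -value / p2**2
jac_product = np.array([[-value / params_reduced[0] ** 2, 0.0],
                        [1.0, 0.0],
                        [0.0, 1.0]])
jac_product_fd = jacobian_fd(lambda p: expand_product(p, value), params_reduced)
```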

specific cases that are currently not supported, to get started:

- extra dispersion params in count models NBP, ...
- models with two exog and cross-parameter restrictions, e.g. Zero-Inflated models or mean-scale in dispersion models like Beta regression

The problem with implementing parameter transformation is that the unconstrained score and hessian are not appropriate for the constrained optimization.
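For a linear or affine transformation p = H u + p0, the chain rule gives the reduced derivatives directly: score_u = H' score_p and hessian_u = H' hessian_p H. The following is a sketch under that assumption, using a hand-written Poisson loglike (constant term dropped) rather than any statsmodels model, with a finite-difference check of the reduced score.

```python
import numpy as np


def loglike(params, endog, exog):
    linpred = exog @ params
    return np.sum(endog * linpred - np.exp(linpred))


def score(params, endog, exog):
    return exog.T @ (endog - np.exp(exog @ params))


def hessian(params, endog, exog):
    mu = np.exp(exog @ params)
    return -(exog * mu[:, None]).T @ exog


rng = np.random.default_rng(0)
exog = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
endog = rng.poisson(np.exp(exog @ np.array([0.2, 0.3, 0.3])))

# Constraint p2 = p3, written as params_full = H @ u + p0.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
p0 = np.zeros(3)
u = np.array([0.1, 0.25])
params_full = H @ u + p0

# Chain rule for the reduced parameterization.
score_reduced = H.T @ score(params_full, endog, exog)
hessian_reduced = H.T @ hessian(params_full, endog, exog) @ H


# Finite-difference check of the reduced score.
def loglike_reduced(u_):
    return loglike(H @ u_ + p0, endog, exog)


eps = 1e-6
score_fd = np.array([
    (loglike_reduced(u + eps * e) - loglike_reduced(u - eps * e)) / (2 * eps)
    for e in np.eye(2)
])
```

For a nonlinear transformation the score keeps the same form with the Jacobian in place of H, but the hessian picks up an extra term from the second derivatives of the transformation.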

A similar problem already exists in the hardcoded optional transformation for extra variance params in count models. We need to add the derivatives of the transformation, depending on whether the transformation is used or not. It also depends on whether the optimization method uses the derivatives. We had problems, and might still have some, in the newer count models like NBP and GPP.

For that case it would be cleaner to add a link for the extra params, given that we have explicit derivatives for link functions.
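To illustrate what a link on an extra parameter implies for the derivatives, here is a toy sketch with an exp inverse link, alpha = exp(a). The loglike is an exponential distribution with scale alpha, chosen only because its derivatives are short. It is not the dispersion likelihood of any count model, and none of the function names are statsmodels API.

```python
import numpy as np

endog = np.array([0.5, 1.2, 2.3, 0.7, 1.9])


def loglike_alpha(alpha):
    """Exponential loglike with scale alpha (toy example)."""
    return -endog.size * np.log(alpha) - endog.sum() / alpha


def score_alpha(alpha):
    return -endog.size / alpha + endog.sum() / alpha**2


def hessian_alpha(alpha):
    return endog.size / alpha**2 - 2 * endog.sum() / alpha**3


def score_log(a):
    """Score w.r.t. a = log(alpha): chain rule with d alpha / d a = alpha."""
    alpha = np.exp(a)
    return score_alpha(alpha) * alpha


def hessian_log(a):
    """Hessian w.r.t. a: includes the second-derivative term of the link."""
    alpha = np.exp(a)
    return hessian_alpha(alpha) * alpha**2 + score_alpha(alpha) * alpha


def loglike_log(a):
    return loglike_alpha(np.exp(a))


# Finite-difference check at an arbitrary point.
a = 0.3
h = 1e-4
score_fd = (loglike_log(a + h) - loglike_log(a - h)) / (2 * h)
hessian_fd = (loglike_log(a + h) - 2 * loglike_log(a) + loglike_log(a - h)) / h**2
```

The second term in `hessian_log` vanishes at the optimum, where the score is zero, but not during the optimization.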

OrderedModel parameterization is messy. I parameterized the model to impose a monotonicity constraint across parameters, but then the derivatives, score and hessian, for those became too complex, and I did not add analytical derivatives.
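One way to impose monotonicity is to keep the first threshold free and build the rest from positive increments via exp and a cumulative sum. This is a sketch of that idea with hypothetical function names; it is not claimed to match the OrderedModel implementation. It shows that the Jacobian of the transformation itself is simple (lower triangular), even if the resulting score and hessian expressions become long.

```python
import numpy as np


def thresholds_from_params(params):
    """Map unconstrained params to strictly increasing thresholds."""
    params = np.asarray(params, dtype=float)
    increments = np.concatenate((params[:1], np.exp(params[1:])))
    return np.cumsum(increments)


def params_from_thresholds(thresholds):
    """Inverse map; requires strictly increasing thresholds."""
    thresholds = np.asarray(thresholds, dtype=float)
    return np.concatenate((thresholds[:1], np.log(np.diff(thresholds))))


def thresholds_jacobian(params):
    """d thresholds / d params: lower triangular, column j scaled by exp(params[j])."""
    params = np.asarray(params, dtype=float)
    k = params.size
    d_increments = np.concatenate(([1.0], np.exp(params[1:])))
    return np.tril(np.ones((k, k))) * d_increments


params = np.array([-1.0, 0.0, np.log(2.0)])
thresholds = thresholds_from_params(params)      # increments are -1, 1, 2
params_back = params_from_thresholds(thresholds)
jac = thresholds_jacobian(params)
```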

The current generic fit_constrained is easier because it has a well defined auxiliary model without constraints.

fix_params constraints are also easy, because the derivative is either unconstrained or drops out (zero or not used).

statsmodels / statsmodels #7567