StaffanBetner opened this issue 5 years ago
It's unlikely; it wouldn't be impossible to implement, but there are lots of higher priorities. Can you comment on what functionality you want that is not covered by the `ordinal` package on CRAN?
The `ordinal` package is a bit of a mess, unfortunately. The old implementations (`clm2`/`clmm2`) only support one random intercept, and the new one doesn't have any predict method (which makes it incompatible with DHARMa) or scale/nominal effects. It also does not support REML estimation.
Fair enough. On the other hand:
Just to second the request for `glmmTMB` to add ordinal regression to its supported families. I'm finding the `ordinal` package can't work with large datasets in the way `glmmTMB` can.
I'd also like to second this request. Fitting ordinal regression models with scale effects and more than one random factor would be a big plus. I'm not aware of any R package that would be capable of this within a frequentist modelling framework.
Can I revive this old issue? Given how fast and easy it seemed to implement ordered beta regression, maybe the underlying TMB package has developed so well that it is now easier to implement ordinal or multinomial regression?
Ordinal regression structures in `glmmTMB` would be welcomed by ecologists and others.
Chiming back in: ordinal regression in `glmmTMB` would be useful, although:

- `clmm` does already support multiple REs
- `ordinal` does poorly on large data sets

Thanks for taking another look, @bbolker! I understand that this is quite demanding, but I just wanted to make sure to explain what I meant by "scale effects":
Those are basically effects on the residual variance (or rather on the SD), e.g. to allow the residual variance to differ between experimental conditions. `ordinal::clm` (fixed effects only) and `ordinal::clmm2` (mixed effects but only a single random factor) implement this, but the most advanced `ordinal::clmm` does not.
I had a particular model in mind with an ordinal response variable (confidence), two crossed random factors, and residual variance that systematically differs between two levels of a fixed factor. That is neither suitable for `ordinal::clmm`, because that function has no "scale effects", nor for `ordinal::clmm2` or `ordinal::clm`, because the model has two random factors.
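To make the "scale effects" idea concrete, here is a minimal numeric sketch (plain Python, purely illustrative; `cum_logit_probs` and the threshold values are my own inventions, not `ordinal` internals): the linear predictor is divided by a condition-specific latent SD before the inverse logit is applied.

```python
import math

def cum_logit_probs(eta, thresholds, sigma=1.0):
    """Category probabilities under a cumulative-logit model:
    P(Y <= j) = logistic((theta_j - eta) / sigma).
    sigma plays the role of the 'scale effect' (latent residual SD)."""
    logistic = lambda x: 1.0 / (1.0 + math.exp(-x))
    cum = [logistic((t - eta) / sigma) for t in thresholds] + [1.0]
    return [hi - lo for lo, hi in zip([0.0] + cum, cum)]

# hypothetical thresholds for a 4-category response
theta = [-1.0, 0.0, 1.5]
p_ref   = cum_logit_probs(0.5, theta, sigma=1.0)  # reference condition
p_noisy = cum_logit_probs(0.5, theta, sigma=2.0)  # condition with doubled latent SD
```

With larger `sigma`, the cumulative probabilities shrink toward 0.5, so probability mass shifts toward the extreme categories; that is the kind of condition-specific residual-SD effect described above.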
I would like to second this. We are specifically looking for a frequentist implementation in R of the following model:
This is currently not possible with `ordinal::clmm2`, which allows the category-specific effects but not the nested random effect design, nor with `ordinal::clmm`, which allows the nested random effect design but not the category-specific effects. I have coded the model in Stan, and there is also support for it in brms. However, there is no frequentist implementation in R that I can find. I have the same problem as @mmrabe, where `clmm2` has support for esoteric versions of ordinal multinomial models (scale effects and nominal effects) but `clmm` has support for more complex random effect structures.
I would be interested in trying to learn how to implement my own family in glmmTMB to do this, but I do not have the experience in C++ coding to do that. Any input would be greatly appreciated.
I haven't thought about this in a little while, but I think that this can be nearly entirely implemented on the R side, by constructing specialized $\mathbf X$ and $\mathbf Z$ matrices (and maybe creating pseudo-observations) ... in which case very little C++ coding (maybe none?) would be needed. Internally, I think the model would look like a binomial where the logit link would take care of the baseline proportional odds assumption. If I'm right, the hard part would be dealing with all of the implicit general assumptions we're making that would be broken by this new family ...
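As a rough illustration of the pseudo-observation idea, here is a sketch under my own assumptions (`expand_pseudo_obs` is a hypothetical helper, not glmmTMB code): each ordinal observation is expanded into binary indicators, with one dummy column per threshold and the negated covariate. Note the indicators from one observation are dependent, so a stacked binomial-logit fit only approximates the exact cumulative-link likelihood.

```python
def expand_pseudo_obs(y, x, J):
    """Expand y in {1..J} into J-1 binary rows I(y <= j), with one
    dummy column per threshold and the negated covariate, so that a
    plain binomial-logit fit on the stacked design matrix mimics
    logit P(Y <= j) = theta_j - beta * x."""
    rows, z = [], []
    for yi, xi in zip(y, x):
        for j in range(1, J):
            dummies = [1.0 if k == j else 0.0 for k in range(1, J)]
            rows.append(dummies + [-xi])   # [threshold dummies, -x]
            z.append(1 if yi <= j else 0)  # binary pseudo-response
    return rows, z

rows, z = expand_pseudo_obs(y=[1, 3, 2], x=[0.2, -0.5, 1.0], J=3)
```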
What does a minimal implementation look like in Stan?
Thanks for thinking about it! Here is an implementation of the model discussed in Chapter 14 of Stroup et al.'s GLMM book in Stan. I have provided Stan code for the proportional odds model and the model where proportional odds assumption is relaxed (i.e. there are category-specific, AKA nominal, effects). Both models have nested random effects.
https://github.com/qdread/stan_multinomial_poinsettia
The repo has the two Stan scripts, example dataset, and R script to fit the Stan models. The example dataset is supposed to represent poinsettias of 3 different varieties grown by 12 different growers that are rated on an ascending scale C, B, A. Variety is the treatment and grower is the block.
Embarrassingly, I don't have a copy of Stroup's book (I need to get one). Your Stan code is a good start but for glmmTMB we need to figure out how to implement everything so that it (1) generalizes nicely to any number of levels; (2) works mostly by linear algebra/matrix computation rather than indexing parameter vectors ... I may be able to start from your code and get to something standalone based on matrix computation, then think about how to incorporate it into glmmTMB without breaking anything ...
Thanks for taking a look! I will try to work on the Stan implementation to turn the loops + indexing into a proper matrix computation, and update you if/when I have something working.
Hi again, I have updated the Stan code in the repo. I've turned the fixed and random effects into design matrices. Now the linear predictors are made by summing up the products of matrix multiplications. So that part is now based on matrix computation and is generalizable to however many levels. I also rewrote the inverse link part so that it is generalizable to >3 levels, though there may be a better way to do that part. Let me know if there is any other way I can help.
FWIW, from the ordinal package vignette:
> The restriction of the threshold parameters $\{\theta_j\}$ being non-decreasing is dealt with by defining $\ell(\theta, \beta; y) = \infty$ when $\{\theta_j\}$ are not in a non-decreasing sequence. If the algorithm attempts evaluation at such illegal values, step-halving effectively brings the algorithm back on track. Other implementations of CLMs re-parameterize $\{\theta_j\}$ such that the non-decreasing nature of $\{\theta_j\}$ is enforced by the parameterization; for example, `MASS::polr` (package version 7.3-49) optimize[s] the likelihood using
>
> $$\tilde\theta_1 = \theta_1,\quad \tilde\theta_2 = \exp(\theta_2 - \theta_1),\quad \ldots,\quad \tilde\theta_{J-1} = \exp(\theta_{J-1} - \theta_{J-2})$$
>
> This is deliberately not used in `ordinal` because the log-likelihood function is generally closer to quadratic in the original parameterization in our experience, which facilitates faster convergence.
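The `MASS::polr` reparameterization quoted above amounts to optimizing over the first threshold plus log-gaps. A minimal round-trip sketch (plain Python, illustrative only; the function names are mine):

```python
import math

def to_thresholds(raw):
    """Map an unconstrained vector [theta_1, z_2, ..., z_{J-1}] to
    strictly increasing thresholds: theta_j = theta_{j-1} + exp(z_j)."""
    theta = [raw[0]]
    for z in raw[1:]:
        theta.append(theta[-1] + math.exp(z))
    return theta

def to_raw(theta):
    """Inverse transform: first threshold plus log of each successive gap."""
    return [theta[0]] + [math.log(hi - lo) for lo, hi in zip(theta, theta[1:])]
```

Any real-valued `raw` vector yields an ordered threshold set, so an unconstrained optimizer never visits illegal values; the trade-off, per the vignette, is a less quadratic log-likelihood surface.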
Some thoughts about implementation. How to handle the non-decreasing threshold constraint:

- Do what `ordinal` does (above) and simply return `NA`/`NaN` when the constraint is violated?
- Do what `MASS::polr()` does and take the cumulative sum of `exp(xi)` (where `xi` is another, unconstrained parameter vector)?
- Use box constraints (`lower`) in `nlminb` to constrain the parameters to be positive?

```r
psicum <- cumsum(psi)  ## or c(cumsum(psi), Inf) ?
prob[i] <- {
  if (response[i] == maxresponse) {
    1 - linkinv(psicum[maxresponse - 1])
  } else if (response[i] == 1) {
    linkinv(psicum[1])
  } else {
    diff(linkinv(psicum[response[i] - c(1, 0)]))
  }
}
## each observation contributes log(prob[i]) to the log-likelihood
log(prob[i])
```
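Taking `log()` of differences of `linkinv()` values, as in the sketch above, can underflow when the linear predictor is extreme. For the logit link, the log probability can be computed directly on the link scale; a hedged Python sketch (the function names are mine, not glmmTMB's):

```python
import math

def softplus(x):
    """log(1 + exp(x)) without overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def log_diff_logistic(a, b):
    """log(logistic(a) - logistic(b)) for a > b, using the identity
    logistic(a) - logistic(b)
      = exp(b) * expm1(a - b) / ((1 + exp(a)) * (1 + exp(b))),
    which stays finite even when both CDF values round to 1.0
    (or 0.0) in double precision."""
    return b + math.log(math.expm1(a - b)) - softplus(a) - softplus(b)
```

For moderate arguments this matches the naive computation to machine precision, while for, say, thresholds around 40 and 60 the naive difference underflows to exactly zero and `log()` fails.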
- Compare results with `ordinal::clm` or `MASS::polr`.
- Replace the `diff()` operator with something that directly computes the log probability from the differences on the link scale ... (`diff(response)*dmu_deta` for extreme values?)

I emailed the author of `ordinal` months ago but received no response. Its GitHub does not seem to address issues. I have reprogrammed 23 hidden functions beneath `ordinal::clm()` locally for my special application (which does not include random effects). Let me know if anyone needs help understanding the undocumented functions. Comments on @bbolker's thoughts:
- Scale effects are fitted via `ordinal::clm(scale = ~)` and supposedly correspond to `glmmTMB(dispformula = ~)`.
- Nominal effects: another way to say that coefficients vary by response level, useful for partial and non-proportional odds; for $j$ response levels, there will be $j - 1$ estimates for each predictor with nominal effects. This is somewhat like a multi-equation model situation. The terms "scale effects" and "nominal effects" are used in `ordinal` but may be rare to see elsewhere.
- In `ordinal`, θ (thresholds and nominal effects, referred to as `clm()$alpha`), β (location coefficients, referred to as `clm()$beta`), and scale coefficients (referred to as `clm()$zeta`) are kept sequentially in one vector, as shown by `rho$par` for all parameter estimates in the hidden functions `clm.newRho()` and `get_clmRho.default()`. In `rho`, `n.psi` is the number of thresholds, nominal coefficients, and location coefficients, and `k` is the number of scale coefficients, so `length(rho$par) == n.psi + k` is the total degrees of freedom. Note that one predictor (one column in `data`) can correspond to one or more parameter estimates (e.g. one scale coefficient and 4 nominal coefficients for 5-point satisfaction responses).
- The matrices `B1, B2` come with special offsets `o1, o2` that give roughly `-Inf` and `+Inf` (`+/-1e5` actually) for `y == 1` and `y == maxy`, respectively, in `clm.newRho()`. The linear predictor is `(drop(B1 %*% par[1:n.psi]) + o1)/sigma` in `clm.nll()`, where `sigma` contains the scale effects from `clm(scale = ~)` (which should correspond to `dispformula = ~` in glmmTMB).
- If a predictor is in the `nominal = ~`
subformula, its location coefficient will be aliased (if the predictor is also included in the main formula, `summary()$coefficients` shows `NA`) because both cannot be identified together. `ordinal` also runs a test function to assert that thresholds + nominal effects are increasing with response levels across all observations, and otherwise gives a warning. Does this situation favor storing the threshold estimates in an "extra family-specific parameters" vector?
- `ordinal` starts with default threshold estimates that make all response categories equally probable, via `qlogis((1:ntheta)/(ntheta + 1))` in `startthreshold()`. All other parameters (nominal coefficients, location coefficients, scale coefficients) start from 0. The optimizer objective, e.g. `clm.nll()`, has no constraints except `if (all(is.finite(rho$fitted)) && all(rho$fitted > 0)) -sum(rho$wts * log(rho$fitted)) else Inf` for the log-likelihood, where `rho$fitted > 0` means that the fitted probabilities of the observed categories must be positive, equivalently dropping cases where thresholds do not increase monotonically with the response level.
- The model intercept must be dropped for identifiability, and `ordinal` does so. For example, in `clm.newRho()`, the line `B1 <- cbind(B1, X[keep, -1, drop = FALSE])` drops the intercept (the first column) of the location design matrix (`X`). This "0 intercept but free thresholds" setup seems the most common and best-understood choice to ensure identifiability. The developer, however, could add routines so that, when all thresholds are known and supplied by the user (e.g. income brackets), the returned estimates are the scale parameter and the intercept, which can be inferred back from the estimated thresholds. Only two parameters are then needed for a null model, with no latent coefficients once all location coefficients are properly rescaled by the inferred scale parameter. However, estimated thresholds --> implied scale parameter --> rescaled location coefficients is a nonlinear transformation. This step can be done either (1) prior to model fitting by reparameterization, so the returned coefficients are directly the scale parameter (like sigma in `lm()`), the intercept, and the rescaled location coefficients, or (2) post fit, perhaps via the delta method.
- Beyond `ordinal`'s features, I think it would be beneficial to include other variants for ordinal responses, which differ only in how the linear predictor is translated into the likelihood function or fitted probability mass but are otherwise very similar to program: (1) continuation-ratio regression, (2) stopping-ratio regression, (3) adjacent-category regression, (4) multinomial models, (5) stereotype models (which restrict the coefficients of multinomial models with proportionality), plus (6) cumulative link models. The benefit is that, except for multinomial models (as in the packages `mlogit` and `gmnl`), I have found no packages fitting these models (e.g. `VGAM`) that include random effects. Also, `VGAM` is very slow. Fullerton (2009) expressed the connection and difference among these variants; Yee (2010) gave a simple summary of which quantity the linear predictor connects to. I guess fitting these variant models requires only minor variation in calculating the fitted probability mass (at least in the logit case), but I am not sure whether the programming can be easily generalized to other link functions. I know that multinomial logit and probit models are very different things, the latter requiring numerical integration and random draws.
- If one focuses on `dispformula = ~` and considers the two-level case, both binary and ordinal situations can be addressed all together.
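To illustrate the point that these variants mostly differ in how the linear predictor is translated into the probability mass, a small sketch with a logit link (plain Python; the function names and threshold values are illustrative, not any package's API):

```python
import math

logistic = lambda x: 1.0 / (1.0 + math.exp(-x))

def cumulative_probs(eta, theta):
    """Cumulative link: P(Y <= j) = F(theta_j - eta)."""
    cum = [logistic(t - eta) for t in theta] + [1.0]
    return [hi - lo for lo, hi in zip([0.0] + cum, cum)]

def continuation_ratio_probs(eta, theta):
    """Continuation ratio: P(Y = j | Y >= j) = F(theta_j - eta)."""
    probs, remaining = [], 1.0
    for t in theta:
        h = logistic(t - eta)        # conditional 'stop here' probability
        probs.append(remaining * h)
        remaining *= 1.0 - h
    probs.append(remaining)          # last category takes what is left
    return probs

theta = [-1.0, 0.0, 1.5]
pc = cumulative_probs(0.3, theta)
pr = continuation_ratio_probs(0.3, theta)
```

Both routines share the same linear predictor and link; only the final mapping to category probabilities changes, which is why these variants could plausibly share most of an implementation.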
It would be very useful if glmmTMB supported different types of ordinal regression, e.g. the proportional odds model (cumulative link logit model).