runehaubo / lmerTestR

Repository for the R-package lmerTest

step function for backward elimination of random-effect terms returns singular models #37

Closed anuthmann closed 4 years ago

anuthmann commented 4 years ago

I’ve been using the step function to determine an optimal random-effects structure for linear mixed models. Starting with the maximal structure justified by the design, I used the step function for backward elimination of random-effects terms. Given the fairly large number of parameters to estimate for the max structure, I did not expect the max model(s) to converge, but I expected the algorithm to identify a random-effects structure that was justified by the data. However, in 22 out of 24 analyses for a paper, the final fitted models returned by the algorithm were still plagued by convergence issues in the form of a boundary (singular) fit. Is there a way of telling the step function to dismiss a model that comes with a convergence message (mDVmax_final@optinfo$conv$lme4$messages)? Or is this a bug?

Since this is the first time that I am using this function, it may well be that I did something wrong. Here is the kind of code I’ve been using:

    f_m_max <- ' ~ SIZE SAL SCOT + (1 + SIZE SAL SCOT | SUBJECT) + (1 + SIZE SAL SCOT | SCENE)' # formula for maximal model
    mDVmax <- lmer(f_dv_m_max, data=fixRepv, REML=T, control=lmerControl(optimizer="bobyqa",optCtrl=list(maxfun=1e6)))
    (mDVmax_step_res <- step(mDVmax, reduce.fixed=F))
    mDVmax_final <- get_model(mDVmax_step_res)

package versions: lmerTest version 3.1-0, lme4 version 1.1-21

Any help is much appreciated.

P.S. In practical terms, I then switched to zero-correlation parameter models, which worked reasonably well.
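For readers unfamiliar with zero-correlation parameter models, here is a minimal sketch of what that switch looks like in lme4. It uses the `sleepstudy` data that ships with lme4 as a stand-in, since the original data (`fixRepv`) is not available; the model below is an assumption for illustration, not the model from the paper.

```r
library(lme4)

# Full model: correlated random intercept and slope
# (2 variances + 1 correlation = 3 covariance parameters).
fm_full <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

# Zero-correlation-parameter model: '||' drops the intercept-slope
# correlation, leaving only the 2 variances.
fm_zcp <- lmer(Reaction ~ Days + (1 + Days || Subject), data = sleepstudy)

# Number of estimated covariance parameters in each model.
length(getME(fm_full, "theta"))
length(getME(fm_zcp, "theta"))

VarCorr(fm_zcp)
```

Note that the `||` shorthand only separates terms cleanly for numeric covariates; with factor predictors the terms have to be expanded by hand.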

sofpj commented 4 years ago

I think this is an issue related to an update of the lme4 package. Is it possible for you to send me an email with some code I can run, where this problem appears?

Cheers, Sofie (sofp@dtu.dk)

anuthmann commented 4 years ago

Just to confirm that I did so on May 4th. Let me know if you didn't get it.

runehaubo commented 4 years ago

@Sofie Pødenphant Jensen (sofp@dtu.dk), has this issue been resolved? Or is it perhaps no longer relevant?

Cheers Rune

runehaubo commented 4 years ago

Dear Antje,

I'm taking the liberty of putting this back on github (if my email hack goes well, that is) and I hope that is alright with you.

Antje, I think you are missing the point here, so let me try to explain. I think the main aspect here is whether to interpret a model with estimated parameters on the boundary of the parameter space as a problem in itself.

As Sofie explains, it is quite common in models with complex random-effects structures (here meaning a covariance matrix of dimension 3 or more for a random-effects term) for the number of parameters estimated away from their boundary to be smaller than the number of free parameters theoretically possible, which is the number of parameters that lme4::lmer attempts to estimate. lmer tries to detect this situation, and if the detection is successful it returns the well-known singularity message. This does not, however, mean that the model is necessarily "illegal". It might be the case that the specification of the model can be meaningfully reduced, but often that is not the case.

To take an example, a random-effects term with a 4x4 VCOV matrix asks lmer to estimate 4(4+1)/2=10 parameters. It may be the case, however, that there are, say, two superfluous parameters (e.g. perhaps they can be expressed as linear combinations of the remaining parameters, or they are estimated at the boundary of the parameter space). In this case you will see the singularity warning, but there is in general no way (within the scope of lme4::lmer) to specify a model with 8 free parameters that estimates the same model with all parameters away from their boundary. The first point is that the model is completely valid even if (slightly) overspecified. The second point is that if you reduce the model, e.g. by specifying a 3x3 VCOV matrix, you are asking for a model with 3(3+1)/2=6 parameters. You are thus asking not only for a different model, but also for a model that is smaller than the one you specified with a 4x4 VCOV matrix, even though 2 parameters were estimated on the boundary of the parameter space (or were linearly dependent).
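The parameter counts in this example can be checked with a few lines of base R. This is only an illustrative sketch; `n_vcov_params` is a hypothetical helper name and not part of lme4 or lmerTest.

```r
# Number of free variance-covariance parameters for a q x q
# random-effects covariance matrix: q variances plus q*(q-1)/2 covariances.
n_vcov_params <- function(q) {
  stopifnot(is.numeric(q), all(q >= 1))
  q * (q + 1) / 2
}

n_vcov_params(4)  # 10
n_vcov_params(3)  # 6

# Dropping one dimension from a 4 x 4 term removes 4 parameters at once.
n_vcov_params(4) - n_vcov_params(3)  # 4
```

Because the count jumps from 10 straight to 6, there is no full q x q term that corresponds to exactly 8 free parameters, which is the gap described above.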

There is no good solution to this other than to accept that you have two models at play, the 3x3 and 4x4 models, and that both are valid and give valid predictions. In most use-cases the 4x4 model would become completely estimable with more data, without anything changing materially. Naturally you get into trouble if you attempt to interpret the individually estimated parameters, but that is also ill-advised: the parameters are sensitive to estimability issues and other things, and are basically just mathematical abstractions constructed to make the machinery work. It is the model predictions or fitted values which are real and interpretable. Only in "lucky" cases, when the parameters are unique representations of the fitted values, are the parameters safe to interpret.

lmerTest and its step function are not concerned with whether parameters are interpretable in any particular way, so step just muscles on to reduce terms where it can.

So after much talk, my advice is that you understand the singularity message as a warning not to interpret the parameters - not as a warning that there is a problem with the model in itself.

Convergence issues may of course occur for reasons other than parameters on the boundary, but that does not change the arguments above, except that it may be possible to reduce convergence problems by changing the optimiser or 'optimising' the optimiser settings.
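As a rough sketch of what changing the optimiser can look like, the same model can be fitted with two of lme4's built-in optimisers and the results compared. The `sleepstudy` data and the model formula are stand-ins chosen for illustration, not the models from this thread.

```r
library(lme4)

# Fit with bobyqa and a generous iteration budget (as in the original post).
fm_bobyqa <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy,
                  control = lmerControl(optimizer = "bobyqa",
                                        optCtrl = list(maxfun = 1e6)))

# Same model, different optimiser.
fm_nm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy,
              control = lmerControl(optimizer = "Nelder_Mead"))

# If both optimisers reach essentially the same log-likelihood, a warning
# from one of them is less likely to indicate a genuinely different fit.
c(bobyqa = as.numeric(logLik(fm_bobyqa)),
  Nelder_Mead = as.numeric(logLik(fm_nm)))
```

lme4 also provides `allFit()`, which automates this kind of comparison across the optimisers it has available.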

Cheers Rune

On Tue, 20 Oct 2020 at 19:40, Antje Nuthmann wrote:

Dear Sofie (and Rune in cc),

Thank you for taking the time to reply, despite being on maternity leave, and apologies for my own delay in responding. I’m aware of the issues you describe below.

Since Rune asked (yesterday) whether the issue has been solved (no) or whether it is no longer relevant (it still is), I thought I would share a few thoughts and things I have learned since our last email exchange.

Doug Bates (I met him at a workshop in September) advised always using "control = [g]lmerControl(calc.derivs = FALSE)"; see https://cran.r-project.org/web/packages/lme4/vignettes/lmerperf.html. This reduces the number of convergence warnings, making the issue a little less pressing.

Still, I believe that users would find the ‘step’ function much more useful if it did (or at least allowed) what I suggested in my original post on GitHub: dismiss a model that comes with a convergence message (whether a convergence warning exists can be checked via "mymodel@optinfo$conv$lme4$messages").
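Nothing in this thread indicates that step offers such an option, but the check can be done by hand on the model that step returns. The sketch below is one possible way to do it; `fit_messages` is a hypothetical helper, and the `sleepstudy` model is a stand-in for the real analysis.

```r
library(lme4)
library(lmerTest)

# Hypothetical helper: return lme4's stored convergence messages
# as a character vector (empty if there are none).
fit_messages <- function(model) {
  msgs <- model@optinfo$conv$lme4$messages
  if (is.null(msgs)) character(0) else unlist(msgs)
}

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Backward elimination of random-effects terms only, then extract the result.
step_res <- step(fm, reduce.fixed = FALSE)
final <- get_model(step_res)

final_msgs <- fit_messages(final)
final_singular <- isSingular(final)

if (length(final_msgs) > 0 || final_singular) {
  message("Final model has fit messages or is singular; inspect before use.")
}
```

This only flags the final model after the fact; it does not change which terms step eliminates along the way.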

I managed to look at the source code of the ‘step’ function via "edit(getAnywhere('step'), file='source_lmerTest.r')", hoping that I could identify the bit of code that needs changing. I couldn’t, as (a) this code is very complex and (b) I am not a professional programmer.

Best, Antje

On 23 Jul 2020, at 19:30, Sofie Pødenphant Jensen sofp@dtu.dk wrote:

Dear Antje

The step function simply reduces the random part of mixed models by performing likelihood ratio tests to check if some of the random effects are non-significant. It does not check for convergence issues related to the models. In complex mixed models (like random slopes models with interactions), singularity occurs very often - even when there are no variances close to zero (or correlations close to -1 or 1). Therefore it is a complicated matter. Here is a citation from the lme4 manual:

"For scalar random effects such as intercept-only models, or 2-dimensional random effects such as intercept+slope models, singularity is relatively easy to detect because it leads to random-effect variance estimates of (nearly) zero, or estimates of correlations that are (almost) exactly -1 or 1. However, for more complex models (variance-covariance matrices of dimension >=3) singularity can be hard to detect; models can often be singular without any of their individual variances being close to zero or correlations being close to +/-1."
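The last point in the quotation can be illustrated with a small constructed example in base R. The matrix below is made up for illustration (it is not taken from any fitted model): all three variances are clearly non-zero and every correlation is -0.5, yet the covariance matrix is singular.

```r
# Correlation matrix with all pairwise correlations equal to -0.5.
R <- matrix(-0.5, nrow = 3, ncol = 3)
diag(R) <- 1

# Arbitrary, clearly non-zero standard deviations (illustrative values).
sds <- c(2, 1, 3)
Sigma <- diag(sds) %*% R %*% diag(sds)

variances <- diag(Sigma)
correlations <- cov2cor(Sigma)[upper.tri(Sigma)]
eigenvalues <- eigen(Sigma, symmetric = TRUE)$values

variances      # none near zero
correlations   # none near -1 or 1
eigenvalues    # the smallest is zero up to floating-point error
```

The zero eigenvalue means one linear combination of the three random effects has no variance, which is exactly the kind of singularity that inspecting individual variances and correlations would miss.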

I am sorry that the lmerTest package cannot help you with this type of issue.

Best, Sofie

runehaubo commented 4 years ago

I'm closing the issue here as I don't think this is a problem with lmerTest or step (which works as intended), but feel free to re-open if evidence of a problem/bug emerges, or if you have/produce a reproducible example of step working inappropriately.

Cheers Rune