JuliaStats / MixedModels.jl

A Julia package for fitting (statistical) mixed-effects models
http://juliastats.org/MixedModels.jl/stable
MIT License

GLMM estimation depends on order of fixed effects? #528

Closed LeonardoJost closed 3 years ago

LeonardoJost commented 3 years ago

Hi, I ran into an issue where the estimated GLMMs depend on the order of the fixed effects in the formula. I have attached an example below. I added some randomness to the data, so the estimated models do not really make sense, but all models fit without errors and return FTOL_REACHED, so I would naively assume that everything works. However, the models m1 and m2 have different estimates and deviances. When estimating the significance of A&B&C, I thus get different values depending on whether I compare model m3 with m1 or with m2. I would assume that matching the term order in the comparison would work best (i.e. comparing m3 to m2), but in this example it does not work at all, so this does not seem to be a general solution.

This also happens with real data, and to a lesser extent when reducing the random-effects structure. For example, when using only random intercepts by id there are still small differences between the estimated coefficients, even though this random-effects structure should be supported by the data.

Am I making a mistake somewhere, or is there a way to resolve this? Is there a "correct" way to order the terms in the formula?

using DataFrames, MixedModels, Random, Distributions
Random.seed!(25)
N = 10000
d=Binomial(1,0.5)
d1=Binomial(1,0.6)
d2=Binomial(1,0.7)
d3=Binomial(1,0.8)
#generate data: predictors with group-dependent shifts, grouping factors, response
df=DataFrame(A=rand(d,N)+repeat([-1,0,0,1],inner=[Int(N/4)]),
            B=rand(d,N)+repeat([1,0,0,1],inner=[Int(N/4)]),
            C=rand(d,N)+repeat([0,1,0,1],inner=[Int(N/4)]),
            D=rand(d,N)+repeat([-1,1,-1,1],inner=[Int(N/4)]),
            id=repeat(["a","b","c","d"],inner=[Int(N/4)]),
            item=repeat(["a","b","c","d","e"],outer=[4],inner=[Int(N/20)]),
            response=[rand(d,Int(N/4));rand(d1,Int(N/4));rand(d2,Int(N/4));rand(d3,Int(N/4))])
#m1: three-way interaction specified as A*B*C
modelFormula=@formula(response~
                          A*B*C+D+
                          (A+B|id)+(C+D|item))
m1=fit(MixedModel,modelFormula,df,Binomial())
#m2: the same terms in a different order
modelFormula=@formula(response~
                          A&B&C+B*C+A*B+A*C+D+
                          (A+B|id)+(C+D|item))
m2=fit(MixedModel,modelFormula,df,Binomial())
#m3: without the three-way interaction
modelFormula=@formula(response~
                          B*C+A*B+A*C+D+
                          (A+B|id)+(C+D|item))
m3=fit(MixedModel,modelFormula,df,Binomial())
show(MixedModels.likelihoodratiotest(m1,m3))
show(MixedModels.likelihoodratiotest(m2,m3))
show(MixedModels.likelihoodratiotest(m1,m2))

#random intercepts by id only
modelFormula=@formula(response~
                          A*B*C+D+
                          (1|id))
m1a=fit(MixedModel,modelFormula,df,Binomial())
modelFormula=@formula(response~
                          A&B&C+B*C+A*B+A*C+D+
                          (1|id))
m2a=fit(MixedModel,modelFormula,df,Binomial())
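
For reference, here is the kind of check I use to quantify the differences between m1 and m2 (a minimal sketch of my own, not package functionality; the Dict is just to align the differently ordered coefficients by name):

#deviances should agree for the same model, regardless of term order
println(deviance(m1) - deviance(m2))
#the fixed effects are the same terms in a different order, so match them by name
c1 = Dict(zip(coefnames(m1), coef(m1)))
c2 = Dict(zip(coefnames(m2), coef(m2)))
println(maximum(abs(c1[k] - c2[k]) for k in keys(c1)))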
dmbates commented 3 years ago

I realize that this is a constructed example, but with models m1, m2 and m3 you are trying to fit 3 by 3 covariance matrices, each with 6 distinct elements, when there are only 4 distinct levels of id and 5 of item. If I try to fit m1 with MixedModels v4.0.0-DEV I get

Variance components:
        Column     Variance    Std.Dev.    Corr.
item (Intercept)  0.000002808 0.001675809
     C            0.000000510 0.000713904 -1.00
     D            0.000522612 0.022860712 +1.00 -1.00
id   (Intercept)  0.379764263 0.616250163
     A            0.000731875 0.027053194 -1.00
     B            0.005630468 0.075036442 -1.00 +1.00

whereas with MixedModels v3.8.0 I get

julia> VarCorr(m1)
Variance components:
        Column     Variance    Std.Dev.    Corr.
item (Intercept)  0.000002560 0.001599867
     C            0.000000487 0.000697943 -1.00
     D            0.000520421 0.022812749 +0.96 -0.96
id   (Intercept)  0.380125092 0.616542855
     A            0.000756583 0.027506056 -1.00
     B            0.005654771 0.075198214 -1.00 +1.00

which just shows how unstable the estimation is. In a large parameter space with many local optima at different positions on the boundary, many things can affect the progress toward the optimum. Even if you simplify things by setting fast=true, the fit is still unstable:

Variance components:
        Column     Variance    Std.Dev.    Corr.
item (Intercept)  0.000000000 0.000000000
     C            0.000000139 0.000372313   NaN
     D            0.000512268 0.022633332   NaN +0.99
id   (Intercept)  0.379485387 0.616023852
     A            0.000726618 0.026955844 -1.00
     B            0.005618376 0.074955830 -1.00 +1.00

Even a model with just (1|id)+(1|item) converges to a singular result (the estimate of the variance of the random effects for item is zero).
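
If you want a programmatic check for this kind of boundary fit, something along these lines should work (a sketch; issingular is the relevant function, and the optimizer summary is stored in the model's optsum):

#a singular fit has some covariance parameter on the boundary
issingular(m1)         # true here: the ±1.00 correlations are boundary values
#the return code only reports that the optimization converged,
#not whether the converged value is on the boundary
m1.optsum.returnvalue  # :FTOL_REACHED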

dmbates commented 3 years ago

By the way, thank you for including a reproducible example.

LeonardoJost commented 3 years ago

Thank you for your help. I have now played a bit more with the constructed example, and if the data matches the models better, the differences are indeed much smaller. However, as with the models m1a and m2a before, there are still very small differences between the estimated coefficients (and sometimes the deviances). I guess this may stem from small floating-point errors in operations on differently ordered matrices? Can these differences safely be neglected?
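
One check I tried, to put the size of these differences in context, is to compare them against the optimizer's own stopping tolerance (a sketch using the m1a/m2a models from my first example; whether this is the right yardstick is part of my question):

#differences within the stopping tolerance are inside the convergence criterion itself
m1a.optsum.ftol_rel                               #relative tolerance of the optimizer
abs(deviance(m1a)-deviance(m2a))/deviance(m1a)    #relative difference between the fits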

For my data analysis, I follow the approach of your Matuschek et al. (2017) paper to reduce the random slopes. For the previous example, this would also work to reduce the random effects structure. However, this does not always work (see example below). I realize that this probably again comes from estimating a 3x3 covariance matrix for only 5 items. But I cannot seem to find any indication of this in the variance components or in the convergence codes. This would be especially problematic if I were to follow the maximal model approach of Barr et al. (2013).

julia> VarCorr(m1)
Variance components:
        Column   Variance Std.Dev.   Corr.
id   (Intercept)  0.196819 0.443642
     A            0.902550 0.950027 +0.19
     B            0.473756 0.688299 -0.01 -0.28
item (Intercept)  0.311414 0.558045
     C            0.295622 0.543711 +0.58
     D            0.382804 0.618712 -0.02 +0.48

Is there any rule on how many random slopes one should estimate given the number of items/participants, or any other diagnostics one could/should run? (I put a rough check of my own after the example code below.) Indeed, this problem no longer occurs if I increase the number of items to 20. With my real data I also run into these problems: there I have a 4x4 covariance matrix for 234 participants and a 2x2 covariance matrix for 10 items, and there is no obvious error. But I also can't reproduce it in the simulations without using too few items, so I'm not sure where the problem lies.

Example Code:

using MixedModels, Random, Distributions, DataFrames
Random.seed!(25)
N = 10000
d=Binomial(1,0.5)
#generate data
df=DataFrame(A=rand(d,N),
            B=rand(d,N),
            C=rand(d,N),
            D=rand(d,N),
            id=repeat(["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t"],inner=[Int(N/20)]),
            item=repeat(["a","b","c","d","e"],outer=[20],inner=[Int(N/100)]),
            idRandomFactor1=repeat(rand(Uniform(-1,1),20),inner=[Int(N/20)]),
            itemRandomFactor1=repeat(rand(Uniform(-1,1),5),outer=[20],inner=[Int(N/100)]),
            idRandomFactor2=repeat(rand(Uniform(-1,1),20),inner=[Int(N/20)]),
            itemRandomFactor2=repeat(rand(Uniform(-1,1),5),outer=[20],inner=[Int(N/100)]),
            idRandomFactor3=repeat(rand(Uniform(-1,1),20),inner=[Int(N/20)]),
            itemRandomFactor3=repeat(rand(Uniform(-1,1),5),outer=[20],inner=[Int(N/100)]),
            idRandomFactor4=repeat(rand(Uniform(-1,1),20),inner=[Int(N/20)]),
            itemRandomFactor4=repeat(rand(Uniform(-1,1),5),outer=[20],inner=[Int(N/100)]))
#generate response variable
df.response=convert.(Int,round.(rand(Uniform(0.3,0.7),N)+
                  #fixed effects
                  0.1*df.A-0.1*df.B+0.1*(df.A.*df.B)-0.03*(df.A.*df.B.*df.C)+0.05*df.D+
                  #random slopes by id
                  0.1*df.idRandomFactor1.*df.A+0.1*df.idRandomFactor2.*df.B+0.05*df.idRandomFactor3+
                  #random slopes by item
                  0.1*df.itemRandomFactor1.*df.C+0.1*df.itemRandomFactor2.*df.D+0.05*df.itemRandomFactor3+
                  #not estimated effects
                  0.3*df.itemRandomFactor4.*df.idRandomFactor4+0.05*df.D.*df.A))
modelFormula=@formula(response~
                          A*B*C+D+
                          (A+B|id)+(C+D|item))
m1=fit(MixedModel,modelFormula,df,Binomial())
modelFormula=@formula(response~
                          A&B&C+B*C+A*B+A*C+D+
                          (A+B|id)+(C+D|item))
m2=fit(MixedModel,modelFormula,df,Binomial())
modelFormula=@formula(response~
                          B*C+A*B+A*C+D+
                          (A+B|id)+(C+D|item))
m3=fit(MixedModel,modelFormula,df,Binomial())
show(MixedModels.likelihoodratiotest(m1,m3))
show(MixedModels.likelihoodratiotest(m2,m3))
show(MixedModels.likelihoodratiotest(m1,m2))
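
As a rough check for the overparameterization I suspect here (my own heuristic, not something from the package):

#a k×k covariance matrix for the random effects has k*(k+1)/2 free parameters;
#compare that count against the number of levels of the grouping factor
k = 3                                #intercept + C + D slopes for item
ncovpar = k * (k + 1) ÷ 2            #6 covariance parameters
nitems = length(unique(df.item))     #5 levels of item
ncovpar > nitems && @warn "more covariance parameters than levels of item"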

Literature

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001

Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001

dmbates commented 3 years ago

issue528.zip

This zip file contains a Pluto notebook, the .html file produced by running the notebook, and the Project.toml file from the directory. The notebook is derived from your code. If you want to run it yourself, you should be able to expand the zip file in a clean directory, install the Pluto package if needed, and then run

using Pluto
Pluto.run()

This brings up a browser window with an Open File selection. You should be able to select issue528.jl and have it run. Please let us know if this doesn't work. We are experimenting with using Pluto notebooks for responding to issues.

It is true that the deviances of fits m1 and m2 are slightly different, but that isn't unexpected. As I mentioned before, there are only 5 distinct levels of item, and the 6 parameters in the covariance matrix for the random effects for item are being estimated from these 5 levels. This will necessarily produce a very flat likelihood surface: you can't get any precision on these parameters because they can trade off against each other. This imprecision carries over to the other parameters, including the fixed-effects parameters. When you switch the order of the A, B, C & D columns, the order of the initial evaluations of the objective changes (use verbose=true in the call to fit to see the gory details). This may be what causes convergence to a different optimum; I don't know enough about the internal details of the BOBYQA algorithm to say for sure.

You may find that using fast=true in the call to fit helps to stabilize the estimates, but there are other trade-offs in doing that.
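
For concreteness, both keyword arguments go directly in the call to fit; a hypothetical re-fit of m1 (names as in your examples) would look like

#fast=true profiles out the fixed effects via PIRLS and optimizes only the
#covariance parameters; verbose=true prints each evaluation of the objective
m1fast = fit(MixedModel,
             @formula(response ~ A*B*C + D + (A+B|id) + (C+D|item)),
             df, Binomial(); fast=true, verbose=true)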

I don't have any hard and fast rules about how complex to allow the covariance structure for the random effects to become. It would have been a lot easier if Barr et al. had written a software package that reliably fits the "maximal" models they claim must always be used, but I haven't seen any software from them. :smile:

LeonardoJost commented 3 years ago

Thank you so much for your help. I'll then try my best to reduce the models and report plausible values while acknowledging possible small uncertainties.

The Pluto package takes some getting used to but works very well. Thank you also for the feedback on my code. I'm still new to Julia, and generating distributions in particular is still much easier for me in R.

dmbates commented 3 years ago

Can we close the issue now or are there further considerations you would like to discuss?