American-Institutes-for-Research / WeMix

WeMix public repository
GNU General Public License v2.0
10 stars 1 forks source link

Scaling of higher-level weights changes the estimates of random effects #4

Closed protivinsky closed 2 years ago

protivinsky commented 2 years ago

The package vignette refers to Rabe-Hesketh and Skrondal (2006) for the model specification and these authors claim in their paper:

"Scaling the weights at the top level L by multiplying by a scalar a simply results in the log-pseudolikelihood being rescaled and therefore does not affect the point estimates. In contrast, scaling the lower level weights does affect the parameter estimates..." (p. 812)

However WeMix estimates of random effects can change substantially if higher-level weights are scaled. Here is an example (using the same dataset as your tests):

library(lme4)
library(WeMix)

data("sleepstudy")
sleepstudyU <- sleepstudy
sleepstudyU$weight1L1 <- 1
sleepstudyU$weight1L2 <- 1
sleepstudyU$weight1L2a <- 0.1
sleepstudyU$weight1L2b <- 10

wm <- mix(Reaction ~ Days + (1|Subject), data=sleepstudyU, weights=c("weight1L1", "weight1L2"))
wma <- mix(Reaction ~ Days + (1|Subject), data=sleepstudyU, weights=c("weight1L1", "weight1L2a"))
wmb <- mix(Reaction ~ Days + (1|Subject), data=sleepstudyU, weights=c("weight1L1", "weight1L2b"))

Snippets of results:

# wm:
Variance terms:
 Level    Group        Name Variance Std. Error Std.Dev.
     2  Subject (Intercept)   1296.9      516.7    36.01
     1 Residual                954.5      220.2    30.90
lnl= -897.0393 

# wma:
Variance terms:
 Level    Group        Name Variance Std. Error Std.Dev.
     2  Subject (Intercept)   1383.6      516.1    37.20
     1 Residual                867.8      200.2    29.46
lnl= -868.9107 

# wmb:
Variance terms:
 Level    Group        Name  Variance Std. Error  Std.Dev.
     2  Subject (Intercept) 1.437e-21  1.196e-17 3.790e-11
     1 Residual             2.251e+03  5.521e+02 4.745e+01
lnl= -950.1465 

The fixed effects estimates are identical for these models, but the random effects estimates shown above vary greatly and seems to be completely broken for the last model (all variance is attributed to residual there). If I do the estimation in Stata (using mixed), everything behaves as expected - the random effects estimates are identical and only log pseudolikelihood scales, as claimed by Rabe-Hesketh and Skrondal (2006).

I wonder if there is an issue with handling level 2 weights? Or am I just misunderstanding something? This is somewhat artificial example indeed, however I am observing the same issue with mixed.sdf via EdSurvey on PISA 2018 data and getting different results to Stata mixed command.

I am using R 4.1.3 on Ubuntu and the latest WeMix (I tried both version 3.2.1 from CRAN and pre-release installation from GitHub).

Thanks.


Rabe-Hesketh, S., & Skrondal, A. (2006). Multilevel modelling of complex survey data. Journal of the Royal Statistical Society. Series A (Statistics in Society), 169 (4 ), 805–827.

pdbailey0 commented 2 years ago

@protivinsky, thanks for including a reproducible MWE.

Is it possible Rabe-Hesketh and Skondal are implying all the weights are also multiplied by the same constant? If so, try cWeight=TRUE.

library(lme4)
library(WeMix)

data("sleepstudy")
sleepstudyU <- sleepstudy
sleepstudyU$weight1L1 <- 1
sleepstudyU$weight1L2 <- 1
sleepstudyU$weight1L2a <- 0.1
sleepstudyU$weight1L2b <- 10

wm <- mix(Reaction ~ Days + (1|Subject), data=sleepstudyU, weights=c("weight1L1", "weight1L2"),cWeights=TRUE)
wma <- mix(Reaction ~ Days + (1|Subject), data=sleepstudyU, weights=c("weight1L1", "weight1L2a"),cWeights=TRUE)
wmb <- mix(Reaction ~ Days + (1|Subject), data=sleepstudyU, weights=c("weight1L1", "weight1L2b"),cWeights=TRUE)

This gives identical results, similarly, snipping out just the variance components.

# wm
 Level    Group        Name Variance Std. Error Std.Dev.
     2  Subject (Intercept)   1296.9      516.7    36.01
     1 Residual                954.5      220.2    30.90
# wma
 Level    Group        Name Variance Std. Error Std.Dev.
     2  Subject (Intercept)   1296.9      516.7    36.01
     1 Residual                954.5      220.2    30.90
# wmb
 Level    Group        Name Variance Std. Error Std.Dev.
     2  Subject (Intercept)   1296.9      516.7    36.01
     1 Residual                954.5      220.2    30.90