simsem / semTools

Useful tools for structural equation modeling
74 stars 35 forks source link

Measurement equivalence of intercepts for binary indicators in `measEq.syntax()` #102

Closed mccarthy-m-g closed 2 years ago

mccarthy-m-g commented 2 years ago

I'm trying to reproduce a lavaan model for testing longitudinal measurement equivalence of intercepts for a set of binary indicators using the semTools::measEq.syntax() function but cannot seem to do so. I'm not sure if I'm specifying something incorrectly here or if this is a bug, so I thought I'd flag it here. I've included a reprex below.

For context, the reprex is based on (part of) Example 2.8 from Jason Newsom's Longitudinal Structural Equation Modeling book. I've been able to reproduce the other longitudinal measurement equivalence examples from the book, but not this one.

I'm using the most recent version of lavaan and the development version of semTools.

# Setup
library(lavaan)
library(semTools)

socex <- data.frame(
  w1unw1d = c(1,1,1,0,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1),
  w1unw2d = c(1,1,1,1,1,1,1,1,1,0,1,1,0,1,1,1,1,1,1,1,0,1,1,1,1),
  w1unw3d = c(1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,0,1,1,1,1),
  w2unw1d = c(1,1,1,0,1,1,1,0,1,1,1,1,1,0,1,1,0,1,1,1,0,1,1,1,1),
  w2unw2d = c(1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,1,1,1,1,0,1,1,1,1),
  w2unw3d = c(1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,0,1,1,1)
)
ordered_indicators <- c("w1unw1d", "w1unw2d", "w1unw3d", "w2unw1d", "w2unw2d", "w2unw3d")

# Desired model
model_1a <- '
  w1unw =~ 1*w1unw1d + a*w1unw2d + b*w1unw3d # Factor loading equality constraint
  w2unw =~ 1*w2unw1d + a*w2unw2d + b*w2unw3d # Factor loading equality constraint

  # Variances & covariances
  w1unw ~~ w2unw
  w1unw ~~ c*w1unw # Factor variance equality constraint
  w2unw ~~ c*w2unw # Factor variance equality constraint

  w1unw1d ~~ w2unw1d
  w1unw2d ~~ w2unw2d
  w1unw3d ~~ w2unw3d

  # Intercepts
  w1unw ~ 1
  w2unw ~ 1

  # Thresholds
  w1unw1d | 0*t1 
  w1unw2d | d*t1 # Intercept equality constraint
  w1unw3d | e*t1 # Intercept equality constraint

  w2unw1d | 0*t1
  w2unw2d | d*t1 # Intercept equality constraint
  w2unw3d | e*t1 # Intercept equality constraint
'

fitmodel_1a <- sem(
  model_1a,
  data = socex,
  parameterization = "theta",
  estimator = "WLSMV",
  ordered = ordered_indicators
)

# Attempt to duplicate with the semTools::measEq.syntax() function. Note that
# this does not match the desired model, but I'm not sure what I've misspecified
# below.
configural_model <- "
  w1unw =~ w1unw1d + w1unw2d + w1unw3d
  w2unw =~ w2unw1d + w2unw2d + w2unw3d
"
longitudinal_factor_names <- list(
  unw = c("w1unw", "w2unw")
)

model_1b <- measEq.syntax(
  configural.model = configural_model,
  longFacNames = longitudinal_factor_names,
  ordered = ordered_indicators,
  parameterization = "theta",
  ID.fac = "std.lv",
  ID.cat = "Wu.Estabrook.2016",
  long.equal = c("loadings", "lv.variances", "intercepts", "thresholds"),
  auto = TRUE,
  data = socex
)
model_1b <- as.character(model_1b)

fitmodel_1b <- sem(
  model_1b,
  data = socex,
  parameterization = "theta",
  estimator = "WLSMV",
  ordered = ordered_indicators
)

# Compare the two models, they should have equal values but do not. Is this
# due to misspecification or a bug?
fitMeasures(fitmodel_1a, c("chisq", "df", "pvalue"))
#>  chisq     df pvalue 
#>  2.492 10.000  0.991
fitMeasures(fitmodel_1b, c("chisq", "df", "pvalue"))
#>  chisq     df pvalue 
#>  1.210  7.000  0.991

# If model_1b is fit using delta parameterization then the same chisq is given
# for the desired model and the one generated by the measEq.syntax() function.
fitmodel_1c <- sem(
  model_1b,
  data = socex,
  parameterization = "delta",
  estimator = "WLSMV",
  ordered = ordered_indicators
)

# But the degrees of freedom are still different, and this approach also goes
# against what's recommended in the examples of the measEq.syntax() function.
fitMeasures(fitmodel_1a, c("chisq", "df", "pvalue"))
#>  chisq     df pvalue 
#>  2.492 10.000  0.991
fitMeasures(fitmodel_1c, c("chisq", "df", "pvalue"))
#>  chisq     df pvalue 
#>  2.492  7.000  0.928

Created on 2021-10-24 by the reprex package (v2.0.0)

TDJorgensen commented 2 years ago

Sorry, this escaped my attention (that was a busy week for me).

The reason measEq.syntax() uses 3 more df than the model you specified is that it will free residual variances in wave 2 when the thresholds, loadings, and intercepts are constrained for binary indicators.

When using the delta parameterization, it will instead free marginal (total) variances via the scaling parameters in wave 2. But I think the problem there is that you still used model_1b, which specifies a model assuming the theta parameterization. That is, model_1b frees the residual variances, which has no effect on fit because they are not model parameters; and the scaling factors remain fixed to 1, so the fit ("chisq") is the same with fewer df counted. Instead, specify another model_1c using measEq.syntax(), as you did for model_1b.