Feature request: two-level SEM with random intercepts

andreaphsz commented 6 years ago

Since lavaan v0.6-1 supports two-level SEM with random intercepts, do you intend to implement this feature in semTools for multiple imputations? Thank you very much!

TDJorgensen commented 6 years ago

have you tried it?

andreaphsz commented 6 years ago

Yes, I did. With the Galo example from Yves Rosseels slides, see http://users.ugent.be/~yrosseel/lavaan/tubingen2017/multilevelSEM_part1.pdf , page 121ff. I did the multiple imputation with

library(semTools)
library(mice)

fit.mi <- sem.mi(model, data = Galo, cluster="school", fixed.x = FALSE,
                 verbose = TRUE, std.lv = TRUE, h1 = TRUE,
                 m = 5, miPackage="mice")

and got

Fehler in is.mids(data) : Argument "data" fehlt (ohne Standardwert)

TDJorgensen commented 6 years ago

I found the problem that cause this error message and fixed it. But I tried running the example and it still fails. I found the first problem, which is that runMI() only checks for multiple groups, not multiple levels. So I will need to do some work to figure out what else might need to be updated. I'll keep working with this example, and suggest Yves add some new multilevel lavInspect() output options that I will need. I'll post here again when I make some progress, probably sometime in September.

FYI, imputing the data correctly with mice() requires passing arguments to make sure the two-level structure of the data are accounted for. I would recommend imputing the data first, to make sure you can get that working correctly. Then you can pass a list of imputed data sets to sem.mi().

TDJorgensen commented 6 years ago

OK, it mostly works, but I have only been able to check single-group multilevel model. I tried adding sex as a grouping variable, but it returned an error, so I'll need to find another example to see what might go wrong with multiple groups (probably the fitted() and resid() methods, and therefore fitMeasures() when requesting SRMR.) But most of the methods work for the single-group Galo example from his tutorial. As the comments below indicate, modindices.mi() returns an error (as does modindices() for complete data), and fitMeasures() returns an error for lavaan.mi objects when requesting incremental fit indices (true by default) because there is an error when trying to fit the baseline model. I'll try to resolve these issues later in September, but I think most of what you'll need is already available in the development version. If you want to save the CFI from each imputation, you can use the FUN= argument (see the bottom example on the ?runMI help page).

## install development version
devtools::install_github("simsem/semTools/semTools")

Galo <- read.table("http://users.ugent.be/~yrosseel/lavaan/tubingen2017/Galo.dat")
names(Galo) <- c("school", "sex", "galo", "advice", "feduc", "meduc",
                 "focc", "denom")

# missing data
Galo[Galo == 999] <- NA

model <- '
level: within
  wses =~ a*focc + b*meduc + c*feduc
  advice ~ wc*wses + wb*galo
  galo   ~ wa*wses
  # residual correlation
  focc ~~ feduc
level: between
  bses =~ a*focc + b*meduc + c*feduc
  advice ~ bc*bses + bb*galo
  galo   ~ ba*bses + denom
  feduc ~~ 0*feduc
# defined parameters
  wi := wa * wb
  bi := ba * bb
'
library(lavaan)
fit <- sem(model, data = Galo, cluster = "school", fixed.x = FALSE,
           missing = "fiml", std.lv = TRUE, h1 = TRUE)
fitmg <- sem(model, data = Galo, cluster = "school", fixed.x = FALSE,
             missing = "fiml", std.lv = TRUE, h1 = TRUE, group = "sex") # fails

summary(fit, fit.measures = TRUE, standardized = TRUE)

## impute data quickly, not accounting for nested structure
library(mice)
m <- 5
mice.out <- mice(Galo, m = m, diagnostics = TRUE)
imputedData <- list()
for (i in 1:m) {
  imputedData[[i]] <- complete(data = mice.out, action = i, include = FALSE)
}

library(semTools)
fit.mi <- sem.mi(model, data = imputedData, cluster = "school", fixed.x = FALSE,
                 std.lv = TRUE)
## runs fine, check methods
fit.mi
summary(fit.mi, ci = TRUE, standardized = TRUE, rsquare = TRUE, fmi = TRUE)
coef(fit.mi)
vcov(fit.mi)
nobs(fit.mi)
fitted(fit.mi)
resid(fit.mi, type = "cor") # works, but resid(fit) generates an error
modindices.mi(fit.mi) # modindices(fit) also generates an error
lavTestScore.mi(fit.mi, test = "D2") # lavTestScore(fit, epc = T) generates an error, expects a "group" column
lavTestScore.mi(fit.mi, test = "rubin") # very different results
lavTestWald.mi(fit.mi, constraints = 'wa == ba ; wb == bb')
lavTestLRT.mi(fit.mi, test = "D2")
anova(fit.mi) # D3 takes an obscenely long time
anova(fit.mi, asymptotic = TRUE) # comparable to anova(fit)
fitMeasures(fit.mi) # fails to fit the baseline.model
fitMeasures(fit.mi, fit.measures = c("rmsea","rmr","aic"))

andreaphsz commented 6 years ago

Works like a charm, also with my data. I am curious if modindices.mi will work soon. Thank you very much for your support!

TDJorgensen commented 5 years ago

Turns out modindices() already worked, but it only failed in a particular combination of conditions, reported here:

https://github.com/yrosseel/lavaan/issues/149

So modindices.mi() already worked all this time, as long as you either use integer labels for your levels in the blocks of your syntax.

The issues were resolved with all the other methods too. The only remaining issue is that the automatic baseline model fails to be fit in the multilevel case, so you currently have to fit your own and pass it to the baseline.model= argument of fitMeasures() if you want incremental indices like CFI. We're working on a solution to that.

andreaphsz commented 5 years ago

Thank you to have investigated in this isssue! And thanky you for reporting it here!

TDJorgensen commented 5 years ago

This modindices() issue has be resolved in the development version of lavaan:

install.packages("lavaan", repos = "http://www.da.ugent.be", type = "source")

That means everything works fine for lavaan.mi as well.

I'll post a final update whenever the baseline.model= issue is resolved, then close this issue.

TDJorgensen commented 4 years ago

I'll post a final update whenever the baseline.model= issue is resolved, then close this issue.

I forget when, but it was resolved :-)

simsem / semTools

Feature request: two-level SEM with random intercepts #39