Closed Suppanut closed 11 months ago
You might be confusing two different steps:
Shouldn't we invert NACOV last
We invert first because the inverse of a diagonal matrix is just taking the reciprocal of diagonal elements. We need to invert the full matrix, then save those diagonal values. This only happens once (per "block": group and/or level) when pooling within- and between-imputation components of (N)ACOV, which is then passed to the lavaan(..., WLS.V=)
argument.
to avoid the inversion of a full matrix in DWLS?
That is happening during estimation, which is after the pooling of Sigma and its NACOV. There is no no avoiding the need to evaluate Equation 9 (Wirth & Edwards, 2009) every time the point estimates are updated, but at that point, the diagonal matrix W is easy to invert because it is just the reciprocals of the diagonal.
I posed the question because I expected that analyzing copies of complete data (which initially have no missing values) using lavaan.mi
should yield results identical to those obtained from analyzing a single complete dataset with lavaan
. However, the results differ.
Please check this example
library(lavaan.mi)
HS <- HolzingerSwineford1939[ , paste0("x", 1:9)]
imps <- list(HS,HS,HS,HS,HS) # 5 copies of the original data
HS3cat <- lapply(imps, function(x) {
as.data.frame( lapply(x, cut, breaks = 3, labels = FALSE) )
})
HS.model <- '
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
'
## pool polychoric correlations and thresholds
prePooledData3 <- poolSat(HS3cat, ordered = paste0("x", 1:9))
fitc <- cfa(HS.model, data = prePooledData3, std.lv = TRUE)
lavTest(fitc, test = "browne.residual.adf", output = "text")
# Model Test User Model:
#
# Browne's residual-based (ADF) test
# Test statistic 47.835
# Degrees of freedom 24
# P-value (Chi-square) 0.003
fit <- sem(HS.model, HS3cat[[1]],
ordered = TRUE,
std.lv = TRUE,
sample.cov.rescale = FALSE,
fixed.x = FALSE,
conditional.x = FALSE,
estimator = "DWLS",
se = "robust.sem",
test = "Browne.residual.adf")
lavTest(fit, test = "browne.residual.adf", output = "text")
# Model Test User Model:
#
# Browne's residual-based (ADF) test
# Test statistic 48.097
# Degrees of freedom 24
# P-value (Chi-square) 0.002
all.equal(fitMeasures(fitc), fitMeasures(fit)) # Mean relative difference: 0.4973422
all.equal(inspect(fitc, "WLS.V"), inspect(fit, "WLS.V")) # "Mean relative difference: 0.534187"
# Modify WLS.V
prePooledData3$WLS.V <- solve(diag(diag(prePooledData3$NACOV)))
fitc2 <- cfa(HS.model, data = prePooledData3, std.lv = TRUE)
lavTest(fitc2, test = "browne.residual.adf", output = "text")
# Model Test User Model:
#
# Browne's residual-based (ADF) test
# Test statistic 48.097
# Degrees of freedom 24
# P-value (Chi-square) 0.002
all.equal(fitMeasures(fit), fitMeasures(fitc2)) # TRUE
all.equal(inspect(fit, "WLS.V"), inspect(fitc2, "WLS.V")) # TRUE
Do you have any thoughts on this?
Ah, this is a nice empirical demonstration without between-imputation variance. I should have thought of that. Thank you!
Looking back at the source material, Muthén (1997, p. 16) does state that W should be the diagonal of Gamma (NACOV
), which is then inverted. So yes, you were right, and that explains why your suggestion works. I will update that to close this issue.
Shouldn't we invert NACOV last to avoid the inversion of a full matrix in DWLS?
solve(diag(diag(NACOV[[g]])))
https://github.com/TDJorgensen/lavaan.mi/blob/b4a106823a83d9f6fd635319073a77ecd5d18755/R/poolSatMod.R#L366