yrosseel / lavaan

an R package for structural equation modeling and more
http://lavaan.org

unnecessary refitting following lav_fit_measures_check_baseline() #306

Closed TDJorgensen closed 3 months ago

TDJorgensen commented 8 months ago

This is an odd one that I noticed because it caused a problem in lavaan.mi when calculating CFI etc.

When all(object@Options$test == fit.indep@Options$test) is not TRUE for the target and baseline models, the baseline model is refitted. This might be totally unnecessary if the only reason they differ is that one $test is c("standard", "satorra.bentler"), whereas the other $test is merely "satorra.bentler".
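To make the mismatch concrete, here is a minimal base-R sketch of the strict comparison next to a more lenient one. The helper same_tests() is hypothetical and not part of lavaan, and whether it is safe to ignore "standard" for every combination of tests is an assumption.

```r
# Two 'test' option vectors like those described above (illustrative values).
test_target   <- c("standard", "satorra.bentler")
test_baseline <- "satorra.bentler"

# The strict element-wise check: the shorter vector is recycled, and the
# first comparison ("standard" vs. "satorra.bentler") is FALSE.
strict_match <- all(test_target == test_baseline)

# Hypothetical helper (not part of lavaan): ignore a redundant "standard"
# whenever another test is also requested, then compare as sets.
same_tests <- function(a, b) {
  drop_std <- function(x) if (length(x) > 1L) setdiff(x, "standard") else x
  setequal(drop_std(a), drop_std(b))
}

lenient_match <- same_tests(test_target, test_baseline)
print(c(strict = strict_match, lenient = lenient_match))
```

With a comparison along these lines, the two models would be treated as compatible and the refit would be skipped.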

The redundant "standard" is sometimes added (e.g., here) under the condition that the estimator is the default (specified explicitly or not at all) while the test= argument specifies something besides "standard". This was originally needed to ensure the standard LRT was provided, even when a residual-based statistic was requested. But when explicitly specifying a scaled (standard) test, the concatenation is unnecessary.

library(lavaan) # provides cfa(), lavaan(), and the HolzingerSwineford1939 data

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit.e <- cfa(HS.model, data = HolzingerSwineford1939, estimator = "mlm") # the usual way
fit.et <- cfa(HS.model, data = HolzingerSwineford1939, estimator = "mlm",
              test = "satorra.bentler") # redundant
fit.t <- cfa(HS.model, data = HolzingerSwineford1939, test = "satorra.bentler",
              se = "robust.huber.white") # for example, usually paired with MLR
fit.e@Options$test
fit.et@Options$test
fit.t@Options$test # "standard" is unnecessarily concatenated

Although the concatenation of "standard" doesn't have any apparent negative consequences when fitting the model, it might not match how a baseline model was fitted. I don't have an update() method for lavaan.mi objects, so I do something analogous to this:

PTb <- lav_partable_independence(fit.e)
fit.b <- lavaan(model = PTb, data = HolzingerSwineford1939, 
                # ...,
                estimator = fit.e@Options$estimator,
                se = "none", # to save time
                test = fit.e@Options$test)
fitMeasures(fit.e, baseline.model = fit.b, fit.measures = "cfi") # warning: refit baseline

Again, this isn't really a problem for lavaan users. They get the same results, and refitting the baseline model will usually be instantaneous (like it is here), but it could be more time-consuming in larger or more complex data situations.

In lavaan.mi, however, refitting the baseline model yields incorrect results, because they are based on the first imputation (stored in @Data) rather than on pooled results. The error is NOT immediately apparent, since everything seems to run fine. I will now check the @call for the originally requested test= and estimator= to avoid this, but there might be other scenarios where this yields unexpected problems, so I created the issue to bring it to your attention. I'm not sure what the most efficient solution would be in lavaan, but if you make a suggestion, I could work on a pull request.
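As a rough illustration of the @call workaround, the sketch below reads the arguments as they were originally typed from a call object. It uses quote() to build a stand-in call so that it runs without lavaan. The helper requested_arg() and its "default" fallback are hypothetical, not the actual lavaan.mi code.

```r
# A stand-in for the call stored in a fitted object's @call slot
# (built with quote() so that lavaan is not needed to run this sketch).
cl <- quote(cfa(HS.model, data = HolzingerSwineford1939,
                test = "satorra.bentler", se = "robust.huber.white"))

# Hypothetical helper: return the argument as the user typed it,
# or a fallback value when the argument was not supplied at all.
requested_arg <- function(call, name, default = "default") {
  value <- as.list(call)[[name]]
  if (is.null(value)) default else value
}

requested_test      <- requested_arg(cl, "test")       # supplied in the call
requested_estimator <- requested_arg(cl, "estimator")  # not supplied: fallback
print(c(test = requested_test, estimator = requested_estimator))
```

One caveat: if an argument was passed as a variable (e.g., test = my.test), the call holds the unevaluated symbol rather than its value, so a real implementation would need to evaluate it in the right environment.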

yrosseel commented 6 months ago

The added 'standard' was due to the use of opt$test <- union("standard", opt$test) when a standard (non-robust) estimator was provided. I have now removed this, so that "standard" is no longer added. That seems to be part of the solution, but perhaps more steps are needed?
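For reference, a small base-R sketch of what that union() call does to a few possible opt$test values. The input values are illustrative only.

```r
# A single requested test gains a leading "standard".
one_test <- union("standard", "satorra.bentler")

# If "standard" was the only test requested, nothing changes.
only_std <- union("standard", "standard")

# If "standard" is already present, it is not duplicated.
already  <- union("standard", c("standard", "satorra.bentler"))

print(list(one_test = one_test, only_std = only_std, already = already))
```

The first case is the one that produced the mismatch with a baseline model fitted using test = "satorra.bentler" alone.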