simsem / semTools

Useful tools for structural equation modeling

`discriminantValidity`: Set `se = "none"` in `constrainedModels` #123

Closed sfcheung closed 1 year ago

sfcheung commented 1 year ago

Hi @mronkko, may I suggest adding se = "none" in the function constrainedModels() defined inside discriminantValidity()?

It is nice that discriminantValidity() supports bootstrap CIs. However, bootstrapping is repeated every time the supplied lavaan object is updated, including in the calls to constrainedModels(). This can be time-consuming when the number of factors, and hence the number of correlations, is large.

As far as I understand this function, constrainedModels() is used to do the likelihood ratio tests. Standard errors are not needed in this step.
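The point that the likelihood ratio test does not depend on the SEs can be checked with a minimal sketch. It assumes the HolzingerSwineford1939 data shipped with lavaan and an arbitrary fixed correlation of .9, and it uses a hypothetical helper, fit_chisq(); none of these come from discriminantValidity() itself.

```r
library(lavaan)

# Two-factor CFA on a dataset shipped with lavaan (an assumption for
# illustration; not the simulated data used later in this issue).
mod_free <- "
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
"
# Same model with the factor correlation fixed to an arbitrary value (.9).
mod_fixed <- paste(mod_free, "visual ~~ 0.9 * textual", sep = "\n")

# Hypothetical helper: fit a model and return c(chisq, df).
fit_chisq <- function(model, se) {
  fit <- cfa(model, data = HolzingerSwineford1939, std.lv = TRUE, se = se)
  as.numeric(fitMeasures(fit, c("chisq", "df")))
}

# Chi-square difference and df difference, with and without SEs.
with_se    <- fit_chisq(mod_fixed, "standard") - fit_chisq(mod_free, "standard")
without_se <- fit_chisq(mod_fixed, "none")     - fit_chisq(mod_free, "none")

# The two differences should agree, because se only controls how the
# standard errors are computed, not the point estimates or the chi-square.
all.equal(with_se, without_se)

# p-value of the likelihood ratio test, obtained without any SEs.
pchisq(without_se[1], df = without_se[2], lower.tail = FALSE)
```

If the two differences agree, skipping the SEs in the constrained fits loses nothing that the test needs.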

How about this minor change:

https://github.com/sfcheung/semTools/blob/78bc7c8d90485e19b2d08225fd7c2b3e3e6b6f92/semTools/R/discriminantValidity.R#L221-L222

    lavaan::update(object, model = thisPt[,1:12],
                   se = "none")

This is an illustration:

library(semTools)
library(lavaan)

mod_p <-
"
f1 =~ .8 * x1 + .8 * x2 + .8 * x3
f2 =~ .7 * x4 + .7 * x5 + .7 * x6
f3 =~ .7 * x7 + .8 * x8 + .9 * x9
f1 ~~ .500*f2
f1 ~~ .600*f3
f2 ~~ .700*f3
"

dat <- simulateData(mod_p, sample.nobs = 200, seed = 45351,
                    empirical = TRUE)

mod <-
"
f1 =~ x1 + x2 + x3
f2 =~ x4 + x5 + x6
f3 =~ x7 + x8 + x9
"

# std.lv = FALSE is intentional, for testing
fit <- cfa(mod, dat, std.lv = FALSE,
           se = "boot", bootstrap = 1000, iseed = 987541)

The following are the results using the current version. The warnings regarding bootstrapping can be ignored because they do not affect the model chi-square; the bootstrapping is merely for estimating the SEs in the constrained models. It took nearly 50 seconds because bootstrapping for the SEs was repeated three times.

> # The original version
> system.time(dv <- discriminantValidity(fit))
Some of the latent variable variances are estimated instead of fixed to 1. The model is re-estimated by scaling the latent variables by fixing their variances and freeing all factor loadings.
   user  system elapsed
  44.41    2.15   48.24
Warning messages:
1: In lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
  lavaan WARNING: 153 bootstrap runs resulted in nonadmissible solutions.
2: In lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
  lavaan WARNING: 46 bootstrap runs resulted in nonadmissible solutions.
3: In lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :
  lavaan WARNING: 3 bootstrap runs resulted in nonadmissible solutions.
> dv
  lhs op rhs est  ci.lower  ci.upper Df      AIC      BIC     Chisq Chisq diff
1  f1 ~~  f2 0.5 0.2794114 0.7182059 25 5697.655 5763.621 21.133634  21.133634
2  f1 ~~  f3 0.6 0.3959094 0.7850434 25 5692.617 5758.583 16.095593  16.095593
3  f2 ~~  f3 0.7 0.5159835 0.8674631 25 5682.750 5748.716  6.228683   6.228683
      RMSEA Df diff   Pr(>Chisq)
1 0.3172825       1 4.283436e-06
2 0.2747325       1 6.022401e-05
3 0.1616892       1 1.256972e-02
>

The following are the results with the proposed change. The p-values for the model chi-square are identical to those in the current version. There are no warning messages because bootstrapping was not done when fitting the constrained models. It took only about 12 seconds. Bootstrapping is still repeated because I intentionally set std.lv = FALSE when fitting the model, so discriminantValidity() refits the model, which is necessary in this case.

> # With se = "none" in constrainedModels
> system.time(dv <- discriminantValidity(fit))
Some of the latent variable variances are estimated instead of fixed to 1. The model is re-estimated by scaling the latent variables by fixing their variances and freeing all factor loadings.
   user  system elapsed 
  11.30    0.67   12.38
> dv
  lhs op rhs est  ci.lower  ci.upper Df      AIC      BIC     Chisq Chisq diff
1  f1 ~~  f2 0.5 0.2794114 0.7182059 25 5697.655 5763.621 21.133634  21.133634
2  f1 ~~  f3 0.6 0.3959094 0.7850434 25 5692.617 5758.583 16.095593  16.095593
3  f2 ~~  f3 0.7 0.5159835 0.8674631 25 5682.750 5748.716  6.228683   6.228683
      RMSEA Df diff   Pr(>Chisq)
1 0.3172825       1 4.283436e-06
2 0.2747325       1 6.022401e-05
3 0.1616892       1 1.256972e-02
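As a quick sanity check, the Pr(>Chisq) column can be recomputed in base R from the Chisq diff and Df diff columns alone, which shows that no standard errors enter the test. This is only a sketch of the arithmetic using the rounded values printed above; the variable names are illustrative and this is not the code path used inside discriminantValidity().

```r
# Values copied from the Chisq diff column of the output above
chisq_diff <- c("f1~~f2" = 21.133634,
                "f1~~f3" = 16.095593,
                "f2~~f3" = 6.228683)
df_diff <- 1

# Upper-tail chi-square probability for each likelihood ratio test
p_values <- pchisq(chisq_diff, df = df_diff, lower.tail = FALSE)
print(p_values)
```

The printed values should match the Pr(>Chisq) column up to rounding of the inputs.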

My two cents.

mronkko commented 1 year ago

Fixed in https://github.com/simsem/semTools/pull/124

sfcheung commented 1 year ago

Thanks a lot for your quick reply!