Using reliability when fit includes ordered variables

jknowles commented 7 years ago

Hi! Big fan of lavaan and happy to see it extended with the semTools package. This looks like great work. I'm running into a snag computing reliability on CFA models when the model includes an ordered (categorical) indicator.

The following is a contrived example that doesn't make sense as a model, but illustrates (perhaps) the problem:

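library(lavaan)    # cfa() and the HolzingerSwineford1939 data
library(semTools)  # reliability()
data("HolzingerSwineford1939")
# Simple model with one ordered indicator (ageyr) among continuous ones
HS.model <- ' visual  =~ x1 + x2 + x3 + ageyr '
# Declare ageyr as ordered, then fit the CFA
fit <- cfa(HS.model, data = HolzingerSwineford1939,
           auto.var = TRUE, auto.fix.first = TRUE, ordered = c("ageyr"),
           auto.cov.lv.x = TRUE)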

When I run the reliability function

reliability(fit)

I get:

> reliability(fit)
Error in lav_data_full(data = data, group = group, cluster = cluster,  : 
  lavaan ERROR: missing observed variables in dataset: NA.

I am using the development versions of both semTools and lavaan. In the example I am actually working on, reliability produces output if the indicators are not declared as ordered, but it fails if they are.

Does the reliability command work on CFA models with ordered categorical variables?

Version info for the packages:

other attached packages:
 [1] semTools_0.4-15.903 lavaan_0.6-1.1118   magrittr_1.5       
 [4] broom_0.4.2         modelr_0.1.0        scales_0.4.1       
 [7] purrr_0.2.2         eeptools_1.0.0      ggplot2_2.2.1      
[10] pROC_1.9.1          knitr_1.15.1        tidyr_0.6.1        
[13] lazyeval_0.2.0      dplyr_0.5.0

TDJorgensen commented 7 years ago

I'm glad you find the package useful, and thanks for providing enough detail to track down the problem. reliability() works fine when all indicators of a factor are ordered:

# datCat is an example data set of ordered items that ships with semTools
model <- ' f1 =~ u1 + u2 + u3 + u4 '
fit <- cfa(model, data = datCat, ordered = c("u1", "u2", "u3", "u4"))
reliability(fit)

But I am unaware of a method for calculating reliability when there is a mixture of continuous and ordinal indicators. I have updated the help page description to say so:

Green and Yang (2009) did not propose a method for calculating reliability with a mixture of categorical and continuous indicators, and we are currently unaware of an appropriate method. Therefore, when reliability detects both categorical and continuous indicators in the model, an error is returned. If the categorical indicators load on a different factor(s) than continuous indicators, then reliability can be calculated separately for those scales by fitting separate models and submitting each to the reliability function.

Green, S. B., & Yang, Y. (2009). Reliability of summed item scores using structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74(1), 155-167. doi:10.1007/s11336-008-9099-3

If you find a reference extending Green & Yang's formula 21 to calculate the covariance between a continuous item and a categorical item, please let me know and I will put it on my to-do list.
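
The workaround described in the help-page text above, fitting a separate model per scale so that no single fit mixes indicator types, might look roughly like the sketch below. It assumes the continuous and ordered indicators really do belong to different factors. The helper `has_mixed_indicators()` is hypothetical (it is not part of semTools or lavaan) and only illustrates one way to check a fitted model before calling reliability(); the two data sets are the ones already used in this thread.

```r
library(lavaan)
library(semTools)

# Hypothetical helper (not part of semTools or lavaan): TRUE when a fitted
# model contains both ordered and continuous observed variables.
has_mixed_indicators <- function(fit) {
  n_ord <- length(lavNames(fit, "ov.ord"))  # ordered observed variables
  n_num <- length(lavNames(fit, "ov.num"))  # numeric observed variables
  n_ord > 0 && n_num > 0
}

# Scale 1: continuous indicators only
data("HolzingerSwineford1939")
fit_cont <- cfa(' visual =~ x1 + x2 + x3 ', data = HolzingerSwineford1939)

# Scale 2: ordered indicators only (datCat ships with semTools)
fit_cat <- cfa(' f1 =~ u1 + u2 + u3 + u4 ', data = datCat,
               ordered = c("u1", "u2", "u3", "u4"))

# Compute reliability separately for each scale
if (!has_mixed_indicators(fit_cont)) print(reliability(fit_cont))
if (!has_mixed_indicators(fit_cat))  print(reliability(fit_cat))
```

Depending on the installed semTools version, reliability() may emit a deprecation message pointing to compRelSEM(); the idea of fitting each scale separately stays the same either way.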

jknowles commented 7 years ago

Thank you, that's great @t129j179

simsem / semTools

Using reliability when fit includes ordered variables #20