ecmerkle / blavaan

An R package for Bayesian structural equation modeling
https://ecmerkle.github.io/blavaan

Latent variable names missing from blavPredict(type = "lv") #53

Closed bgall closed 2 years ago

bgall commented 2 years ago

Is there any way to add variable names to the output of blavPredict(type = "lv")? It doesn't seem to return them, so it can be unclear which column of the output matrix corresponds to which latent variable. The base lavPredict() doesn't seem to have this issue. I tried looking around the source code, but I would need to dig in deep to find out how naming works. I assume (perhaps falsely) that this would be pretty easy to address for someone who already knows their way around.


library(dplyr)
library(blavaan)
set.seed(123)

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Load data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

data(HolzingerSwineford1939)
dat <- HolzingerSwineford1939

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Specify SEM
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

model <- "

# Measurement Model
lv1 =~ NA*x1 + x2 + x3
lv2 =~ NA*x4 + x5 + x6
lv3 =~ NA*x7 + x8 + x9

# Residual Variances of Latent Factors
lv1 ~~ 1*lv1
lv2 ~~ 1*lv2
lv3 ~~ 1*lv3

# Add some arbitrary covariances
lv1 ~~ lv2
x1 ~~ x4

# Regression
lv1 ~ lv3 + sex + grade + ageyr
lv2 ~ lv3 + sex + grade + ageyr
"

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Estimation
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Fit model with default priors
m_out <- model %>% blavaan::bsem(data = dat, 
                                 sample = 1000, 
                                 burnin = 1000, 
                                 n.chains = 2,
                                 save.lvs = TRUE,
                                 seed = 123)

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Post-estimation Issue : blavPredict() dropping variable names
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# In lavaan, lavPredict returns a lavaan.matrix with a column name attribute
# identifying which column contains the scores for which factor.
m_out %>% lavPredict(type = "lv") %>% str()

# However, in blavaan, blavPredict returns a list of matrices WITHOUT the
# column name attribute. This makes it unclear in many cases which column
# corresponds to which factor.
m_out %>% blavPredict(type = "lv") %>% str()

ecmerkle commented 2 years ago

Thanks, this was an oversight and should be easy to add. In the meantime, you should be able to get the lv names from blavInspect:

lvsamps <- blavPredict(m_out, type = "lv")

lvnames <- colnames(blavInspect(m_out, "lvmeans"))

## add names to blavPredict output:
lvsamps <- lapply(lvsamps, function(x) { colnames(x) <- lvnames; x })
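
For anyone who wants to try the renaming step without fitting a model, here is a minimal base-R sketch. It assumes only what is described in this thread: that blavPredict(type = "lv") returns a list of unnamed matrices, one per posterior draw. The objects lvsamps_mock and lvnames_mock are hypothetical stand-ins, not blavaan output. One detail worth noting is that the function passed to lapply() has to return the matrix itself, because the value of an assignment is its right-hand side.

```r
# Hypothetical stand-in for blavPredict(type = "lv") output: a list of
# unnamed matrices, one per posterior draw (rows = people, columns = lvs).
set.seed(1)
lvsamps_mock <- replicate(4, matrix(rnorm(15), nrow = 5, ncol = 3),
                          simplify = FALSE)

# Hypothetical stand-in for colnames(blavInspect(m_out, "lvmeans")).
lvnames_mock <- c("lv1", "lv2", "lv3")

# Name the columns of every draw. The last expression in the function is x,
# so lapply() collects the renamed matrices rather than the name vector.
named <- lapply(lvsamps_mock, function(x) {
  colnames(x) <- lvnames_mock
  x
})

# Inspect the first draw: it should now carry column names.
str(named[[1]])
```

With real output, lvsamps_mock and lvnames_mock would be replaced by the results of the blavPredict and blavInspect calls shown in this comment.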
ecmerkle commented 2 years ago

One additional thing here: in lavaan, lavPredict gives you a single factor score per person/lv, arranged in a matrix. In contrast, blavPredict gives you a list where each entry is a matrix representing one posterior sample. If you want a single factor score like lavaan's (in blavaan, this is the mean of the posterior distribution), see blavInspect(m_out, "lvmeans").
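
To make the relationship between the two outputs concrete, here is a rough base-R sketch of how a list of per-draw matrices can be averaged into one score per person and latent variable. It uses mock data (draws_mock is a hypothetical stand-in for the blavPredict list) and is only an illustration of the idea of a posterior mean; it is not a claim about how blavInspect computes "lvmeans" internally.

```r
# Hypothetical stand-in for the list returned by blavPredict(type = "lv"):
# 100 posterior draws, each a 3-person by 2-lv matrix.
set.seed(2)
draws_mock <- replicate(
  100,
  matrix(rnorm(6), nrow = 3, ncol = 2,
         dimnames = list(NULL, c("lv1", "lv2"))),
  simplify = FALSE
)

# Element-wise posterior mean: sum the matrices, divide by the number of draws.
post_mean <- Reduce(`+`, draws_mock) / length(draws_mock)

# Same result via a 3-d array (people x lvs x draws), which is also a handy
# shape for other summaries such as per-cell quantiles.
draws_arr <- simplify2array(draws_mock)
post_mean_arr <- apply(draws_arr, c(1, 2), mean)

print(post_mean)
```

The array form makes it easy to swap mean for another summary, for example apply(draws_arr, c(1, 2), quantile, probs = 0.975) for an upper interval bound.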