cjvanlissa / tidySEM

54 stars 7 forks source link

How to fix exogenous variables in OpenMx? #65

Open cjvanlissa opened 1 year ago

cjvanlissa commented 1 year ago

The top example does not work, the bottom one does - but it is inconvenient to specify the variance of exogenous variables by hand:

mod <- as_ram("y1 ~ a*Species
              y1~~b*y1", fixed.x = TRUE)
res <- run_mx(mod, data = df)
mod <- as_ram("y1 ~ a*Species
              y1~~b*y1
              Species ~~ .25*Species", fixed.x = TRUE)
res <- run_mx(mod, data = df)
cjvanlissa commented 1 year ago

@maugavilla this is a real headscratcher for me... do you have ideas?

maugavilla commented 1 year ago

I think a possible solution is to fix the $values slot of the S model matrix for the exogeneours variables to the observed variance in the data. Because right now their variance is not included at all when fixed.x=T, but we need to include it, but as a fixed value.

Not sure where to add it into the as.ram() parser, but something like this

` dat <- iris colnames(dat) <- c("SepalLength","SepalWidth", "PetalLength","PetalWidth","Species")

mod <- as_ram("SepalLength ~ aPetalLength SepalLength ~~bSepalLength", fixed.x = T) mod

mod$S$values["PetalLength","PetalLength"] <- var(dat$PetalLength)

res <- run_mx(mod, data = dat) summary(res)`

Hope this makes sense

cjvanlissa commented 1 year ago

I've implemented a fix that does this, but only for the variance of observed variables. Not sure if that is correct, what do you think?

maugavilla commented 1 year ago

I think this makes sense, and matches how lavaan handles is in several parts. For example, I get the same estimates, number of parameters, df. But the logLik is different. Which makes me think that lavaan ignores the X side of the equations for the logLik estimation. While OpenMx still includes them, leading to the discrepancy in fit but on model parameters

` library(tidySEM)

dat <- iris colnames(dat) <- c("SepalLength","SepalWidth", "PetalLength","PetalWidth","Species")

mod <- as_ram("SepalLength ~ aPetalLength SepalLength ~~bSepalLength", fixed.x = T) mod mod$S

mod$S$values["PetalLength","PetalLength"] <- var(dat$PetalLength)

res <- run_mx(mod, data = dat) summary(res)

library(lavaan)

ss <- sem("SepalLength ~ aPetalLength SepalLength ~~bSepalLength", data=dat, meanstructure=T, fixed.x=T) summary(ss, fit.measures=T) parameterestimates(ss, ci=T)

logLik(res) logLik(ss) `

cjvanlissa commented 1 year ago

Any thoughts about what the consequences of this are for different types of models, and whether we should do anything about this?

maugavilla commented 1 year ago

We would need to adjust the fit measures in OpenMx to ignore the x side. Which I dont think we can, or would be easy. I would suggest to add a warning, when fixed.x =T, saying tha fit indices might be unreliable in OpenMx