umxACE, standardized estimates not standardized

tbates / umx

Making Structural Equation Modeling (SEM) in R quick & powerful

https://tbates.github.io/

umxACE, standardized estimates not standardized #151

Closed jpritikin closed 3 years ago

jpritikin commented 3 years ago

umx 4.3

library(umx)

data(twinData) # ?twinData from Australian twins.
twinData[, c("ht1", "ht2")] = twinData[, c("ht1", "ht2")] * 10
mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]
m1 = umxACE(selDVs = "ht", selCovs = "age", sep = "", dzData = dzData, mzData = mzData)

sum(m1$top$a_std$result,
    m1$top$c_std$result,
    m1$top$e_std$result)  # not 1.0

m2 = umxACEv(selDVs = "ht", selCovs = "age", sep = "", dzData = dzData, mzData = mzData)
sum(m2$top$A_std$result,
    m2$top$C_std$result,
    m2$top$E_std$result) # 1.0

Looks like the umxACE formulas use SD %*% a instead of SD %&% a.
Why the naming difference? Why use lowercase a_std and uppercase A_std?

tbates commented 3 years ago

a is for paths, A is for variances (changing case is a fairly standard way to distinguish paths from variances).

Variance is what is being standardized, hence summing to 1 (a_std^2 + c_std^2 + e_std^2 sums to 1).

0.929^2 + 0.082^2 + 0.36^2 ≈ 1 (0.9994 with these rounded values)
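The same arithmetic can be reproduced in base R. This is a minimal sketch with hypothetical path values (0.9, 0.1 and 0.4 are made up, not estimates from the model above). It assumes SD is the inverse of the diagonal matrix of total standard deviations, and it writes the OpenMx quadratic product SD %&% A out as SD %*% A %*% t(SD):

```r
# Hypothetical univariate path coefficients (illustrative values only).
a <- matrix(0.9)
c <- matrix(0.1)
e <- matrix(0.4)

# Variance components implied by the paths.
A <- a %*% t(a)
C <- c %*% t(c)
E <- e %*% t(e)
Vtot <- A + C + E

# Assumed definition of SD: inverse of the diagonal matrix of total SDs.
SD <- solve(sqrt(diag(diag(Vtot), nrow = nrow(Vtot))))

# Standardized paths: a plain matrix product.
a_std <- SD %*% a
c_std <- SD %*% c
e_std <- SD %*% e

# Standardized variances: the quadratic form SD A SD'.
A_std <- SD %*% A %*% t(SD)
C_std <- SD %*% C %*% t(SD)
E_std <- SD %*% E %*% t(SD)

sum(a_std, c_std, e_std)        # paths: not 1
sum(a_std^2, c_std^2, e_std^2)  # squared paths: 1 (up to floating point)
sum(A_std, C_std, E_std)        # variances: 1 (up to floating point)
```

The standardized paths only sum to 1 after squaring, whereas the standardized variances sum to 1 directly.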

jpritikin commented 3 years ago

OK, maybe you can add the missing lower & uppercase algebras so it is easy to switch between umxACE and umxACEv?

tbates commented 3 years ago

For umxACE, adding algebras to compute A_std etc. would mean they get computed every time, so would slow execution (unless there's a way to tell an algebra to compute only once at model completion?)

But also, is there a use case for wanting, e.g., A_std from an ACE model? The model paths are in the lower-triangular a, c, and e matrices, and don't survive the a %*% t(a) algebras except in the univariate 1 x 1 case.

For umxACEv it doesn't contain any paths/lower-case variables.
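To illustrate why the paths are only recoverable in the univariate case, here is a small base R sketch. The 2 x 2 lower-triangular matrix is hypothetical and not taken from any fitted model:

```r
# Hypothetical 2 x 2 lower-triangular path matrix (filled column by column):
#      [,1] [,2]
# [1,]  1.0  0.0
# [2,]  0.5  0.8
a <- matrix(c(1.0, 0.5,
              0.0, 0.8), nrow = 2)

# Variance component implied by the paths.
A <- a %*% t(a)

A      # A[2, 2] = 0.5^2 + 0.8^2 mixes two paths into one variance
a^2    # element-wise squares of the paths: a different matrix

# Univariate (1 x 1) case: the variance is just the squared path.
a1 <- matrix(0.9)
A1 <- a1 %*% t(a1)
all.equal(A1, a1^2)
```

In the bivariate case the diagonal of A pools several paths, so the individual entries of a cannot be read back from A without a further decomposition.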

jpritikin commented 3 years ago

> adding algebras to compute A_std etc. would mean they get computed every time, so would slow execution (unless there's a way to tell an algebra to compute only once at model completion?)

Nope, the algebras are only computed twice: once at the beginning and once at the end.

library(umx)

data(twinData) # ?twinData from Australian twins.
twinData[, c("ht1", "ht2")] = twinData[, c("ht1", "ht2")] * 10
mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]
m1 = umxACE(selDVs = "ht", selCovs = "age", sep = "", dzData = dzData, mzData = mzData,
              autoRun=FALSE)

m1 <- mxModel(m1, mxModel(m1$top, mxAlgebra(SD %&% A, name="A_std", verbose=2L)))
m1 <- mxRun(m1)

> For umxACEv it doesn't contain any paths/lower-case variables.

Yeah, that's true. I didn't really think it through.

mcneale commented 3 years ago

Is that because certain matrices and algebras do not affect the fit function? For example, because they are in an mxModel that has no data, with no label-based equality constraints and no use of those matrices and algebras by mxModels that do have data? Getting something like this wrong would be bad… getting it right is definitely more efficient.

jpritikin commented 3 years ago

> Is that because certain matrices and algebras do not affect the fit function?

Yeah, exactly

tbates commented 3 years ago

Good to know, re algebras! (Slightly amazed that this can be automated.) umxACE (on GitHub) now has:

mxAlgebra(name = "A_std", SD %&% A), # standardized A
# ...

in addition to the existing:

mxAlgebra(name = "a_std", SD %*% a), # standardized a
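The two algebras are linked by a simple identity. As a sketch, assuming A = a a' (the a %*% t(a) algebra mentioned earlier in the thread) and that %&% is the OpenMx quadratic product, so that SD %&% A means SD A SD':

```latex
\mathbf{A}_{\mathrm{std}}
  = \mathbf{SD}\,\mathbf{A}\,\mathbf{SD}^{\top}
  = (\mathbf{SD}\,\mathbf{a})(\mathbf{SD}\,\mathbf{a})^{\top}
  = \mathbf{a}_{\mathrm{std}}\,\mathbf{a}_{\mathrm{std}}^{\top}
```

In the univariate case this reduces to A_std = a_std^2, which is why the squared standardized paths sum to 1.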

jpritikin commented 3 years ago

Awesome! Thank you.