tbates / umx

Making Structural Equation Modeling (SEM) in R quick & powerful
https://tbates.github.io/
44 stars 17 forks source link

Make diagrams with copaths #140

Closed mcneale closed 3 years ago

mcneale commented 3 years ago

OpenMx is releasing selection paths, which are headless arrows. (arrows=0). If these could be drawn, it would be really useful.

tbates commented 3 years ago

Sounds interesting! Want to drop a script for a really minimalist example model here?

tbates commented 3 years ago

So here's a minimalist script... Questions:

Q1: How should the umxPath('V1', 'V2', arrows=0, values = .2) path be drawn?

Q2: Also, how does one map the value for that path back out of it's apparent storage in the model (t2$selectionVector). That matrix has no labels... (is that an OpenMx bug?)

  1. Make a model to generate data out of with 3 variables, correlating 0.166
t1 = mxModel("t1",
  umxMatrix("c1", "Symm", 3,3, FALSE, values=diag(3)+.2),
  # var 2,var1 cov left free in c2
  umxMatrix("c2", "Symm", 3,3, free=c(F,T,rep(F,4)), values=diag(3)+.2),
  umxMatrix("m1", "Full", 3, 1, values=.1),
  mxAlgebra(name="c3", mxPearsonSelCov(origCov= c1, newCov = c2)),
  mxAlgebra(name="m2", mxPearsonSelMean(origCov= c1, newCov = c2, origMean = m1)),
  mxExpectationNormal("c3", means = "m2"), mxFitFunctionML()
)

# check the data
dat = mxGenerateData(t1, nrows = 800)
umxAPA(dat)
V1 V2 V3
V1 1 (0)
V2 0.15 (0.03) 1 (0)
V3 0.19 (0.03) 0.17 (0.03) 1 (0)
Mean (SD) 0.11 (1.11) 0.09 (1.13) 0.14 (1.11)

Now induce some Aitkin-Pearson selection by setting cov(V2, V1) to -.2 in the "c2" sample covariance matrix (it was .2)

trueSelCor = -.2
t1 = omxSetParameters(t1, "c2_r2c1", values=trueSelCor)

dat = mxGenerateData(t1, nrows = 800)
umxAPA(dat)
V1 V2 V3
V1 1 (0)
V2 -0.17 (0.03) 1 (0)
V3 0.1 (0.03) 0.14 (0.03) 1 (0)
Mean (SD) 0.1 (1.13) 0.11 (1.1) 0.08 (1.16)
  1. Now umxRAM model the selected data
manifests = paste0('V',1:3)
t2 = umxRAM("t2", data = dat, tryHard="yes",
  umxPath(v.m. = manifests),
  umxPath(unique.bivariate = manifests),
  umxPath('V1','V2', arrows=0, values = .2)
)

Figure: means not shown

Aitkin, (means not shown)
mcneale commented 3 years ago

Q1 Copaths have no arrowheads at all - just a line connecting the relevant objects. This may include an autocorrelation path, although I have yet to use such a model. Such "Autoselection" paths could be used to represent a variance change imposed on a variable. Q2 Usually, it would be a free parameter in the model; if it was set to zero there is no need to show it. I see limited, if any, use in fixing it to a non-zero value, but I suppose a Bayesian with a big prior and no changed posterior(!) might go there :). I will ask Joshua about its storage for such cases.

jpritikin commented 3 years ago

Also, how does one map the value for that path back out of it's apparent storage in the model (t2$selectionVector). That matrix has no labels...

The information is part of the expectation. For example, given this model,

t2 <- mxModel(
  "t2", type="RAM",
  manifestVars = paste0('V',1:3),
  mxPath(paste0('V',1:3), arrows=2, values=1.2, free=FALSE),
  mxPath(paste0('V',1:3), arrows=2,
         connect = "unique.bivariate", values=.2, free=FALSE),
  mxPath('one',  paste0('V',1:3), values=.1, free=FALSE),
  mxPath('V1','V2', arrows=0, values=.2),
  mxData(dat, 'raw'))

the follow information is available from the expectation,

> t2$expectation$selectionPlan
  step from to
1    1   V1 V2
> t2$expectation$selectionVector
[1] "selectionVector"

The selectionVector slot names the vector where the values are stored. The selectionPlan is a data.frame with three columns. I guess these edges are not directional so from and to are exchangeable. The step column indicates the order of application. There can be multiple arrows per step, but all arrows within a step are applied simultaneously. I'm not sure whether you'd want to try to draw the step as well as the arrow.

mcneale commented 3 years ago

Maybe color or dotted styles (or both) could be used for different steps? It’s a rare thing today for such complexities to be needed, but the future is uncertain here. Note that network diagram drawings often have such ‘undirected’ paths, albeit with values at inverse of what this model would obtain in the saturated everything-associates-with-everything situation.

tbates commented 3 years ago

TODO:

  1. Check model expectation for non-empty model$expectation$selectionPlan.
  2. Draw each found "from-to" pair as a headless arrow on the diagram.
    • Where the values are stored is contained in model$expectation$selectionVector.
    • model[["selectionVector"]]$values contains a matrix of values to label these paths found in the selectionPlan.
tbates commented 3 years ago

Question: What (if anything) should happen if the user standardises the model? Just say "sorry", or apply some code (what?), or selectionVector is unchanged/already standardized?

tbates commented 3 years ago

howz'at? (color = RGB for step 123) always dotted to distinguish.

plot(m1, means=F)
Screenshot 2021-03-06 at 01 50 03 1
mcneale commented 3 years ago

A solid line with no arrows is the convention for a copath, albeit a rarely-used one. Standardization is an interesting question. Possibly, pre- and post-multiplication with a diagonal matrix of the reciprocals of the SDs of the variables that a copath connects would do the right thing. A crude way to check would be to try the model with unstandardized variables, then standardize them before analysis. Working through the selection algebra might corroborate, but it’s important to note that covariance due to other sources (one- and two-headed arrows) has already happened, so whatever metrics the variables are in are the ones on which selection formula operates.

tbates commented 3 years ago

un-dotted. will play with std

Screen Shot 2021-03-07 at 4 15 13 pm
trueSelCor = -.2
t1 = mxModel("t1",
  umxMatrix("c1", "Symm", 3,3, FALSE, values=diag(3)+.2),
  umxMatrix("c2", "Symm", 3,3, free=c(F,T,rep(F,4)), values=diag(3)+.2),
  umxMatrix("m1", "Full", 3, 1, values=.1),
  mxAlgebra(name="c3", mxPearsonSelCov(origCov= c1, newCov = c2)),
  mxAlgebra(name="m2", mxPearsonSelMean(origCov= c1, newCov = c2, origMean = m1)),
  mxExpectationNormal("c3", means = "m2"), mxFitFunctionML()
)
t1 = omxSetParameters(t1, "c2_r2c1", values=trueSelCor)

t2 = umxRAM("t2", data = mxGenerateData(t1, nrows = 800), tryHard="yes",
  umxPath(v.m. = c("V1","V2","V3")),
  umxPath(unique.bivariate = c("V1","V2","V3")),
  umxPath('V1','V2', arrows=0, values = .2)
)
plot(t2)