tbates / umx

Making Structural Equation Modeling (SEM) in R quick & powerful
https://tbates.github.io/

What to name a definition variable in RAM model? #107

Closed tbates closed 5 years ago

tbates commented 5 years ago

In RAM, a definition variable "var" is implemented as a latent whose variance is fixed at zero and whose mean is fixed (not free), with the means path labeled "data.var" so that it takes its value from the data.

This leaves open to the user what to name the latent.

It's legal in RAM to name the latent "var", but that makes it impossible to also use var as a manifest.

I'm thinking that in umx, definition variables should by default be named "def_var" (i.e., the variable name prefixed with "def_").

So umxPath(defn = "org1") would create:

mxPath(from = "def_org1", arrows = 2, free = FALSE, values = 0)
mxPath(from = "one", to = "def_org1", free = FALSE, values = 0, labels = "data.org1")

But the user can override this with any legal name other than the name of the variable itself (to avoid clashing with manifest space).

In this case the user would specify both the label and the name for the latent.

umxPath(defn = "my_org1_defvar", labels = "org1") would create:

mxPath(from = "my_org1_defvar", arrows = 2, free = FALSE, values = 0)
mxPath(from = "one", to = "my_org1_defvar", free = FALSE, values = 0, labels = "data.org1")
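
The naming rule described above can be sketched in base R. The helper name def_var_name and its arguments are hypothetical, invented here for illustration; this is not how umxPath is actually implemented.

```r
# Hypothetical helper (not part of umx): pick a name for the dummy latent
# that carries a definition variable, plus the "data." label for its mean path.
def_var_name <- function(data_var, latent_name = NULL) {
  if (is.null(latent_name)) {
    # Default proposed in this issue: prefix the data column name with "def_"
    latent_name <- paste0("def_", data_var)
  }
  if (identical(latent_name, data_var)) {
    # Reusing the column name would block using that column as a manifest
    stop("Latent name '", latent_name, "' clashes with the data variable; ",
         "choose a different name.")
  }
  list(latent = latent_name, label = paste0("data.", data_var))
}

def_var_name("org1")                                  # default name
def_var_name("org1", latent_name = "my_org1_defvar")  # user-supplied name
```

The only name rejected is the data column's own name, which matches the one restriction stated above.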

thoughts?

mcneale commented 5 years ago

I think conceptualizing definition variables as latent and perhaps exogenous is consistent with their use as covariates. It is an (perhaps the only) appropriate way to model covariates for ordinal data; with continuous measures it can be simpler to merely regress them out before model fitting. However, definition variables are really placed on paths (not variables) in every case, so if we want to, e.g., moderate a relationship between two latent variables, they can be very useful.
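
As a rough illustration of the "regress them out" option for a continuous measure, here is a minimal base R sketch. The data are simulated and the column names (age, score, score_resid) are assumptions made up for the example.

```r
# Simulated data: a continuous measure that depends on a covariate (age)
set.seed(42)
n <- 200
dat <- data.frame(age = rnorm(n, mean = 40, sd = 10))
dat$score <- 0.3 * dat$age + rnorm(n)
dat$age[c(5, 17)] <- NA  # a couple of missing covariate values

# Residualize the measure on the covariate before model fitting.
# na.exclude keeps the residual vector aligned with the rows of dat,
# padding rows with a missing covariate with NA.
fit <- lm(score ~ age, data = dat, na.action = na.exclude)
dat$score_resid <- residuals(fit)
```

Rows with a missing covariate end up with a missing residual, so this route also loses information when the covariate is incomplete.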

I worry that the automagical approach considers them in too narrow a context, as being like covariates, when they are treated quite differently from an observed variable in SEM, including that they require listwise deletion should their values be missing.

tbates commented 5 years ago

This is just for RAM, where (correct me if wrong) they have to be brought in via a dummy latent variable... there's no current alternative, is there?
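
For concreteness, the dummy-latent approach might look like this in a small RAM model. This is a sketch on simulated data, assuming OpenMx is installed; the model name and the labels b_org1, resid_y and mean_y are made up for illustration.

```r
library(OpenMx)

# Simulated data: y depends on the covariate org1
set.seed(1)
df <- data.frame(org1 = rnorm(100))
df$y <- 0.5 * df$org1 + rnorm(100)

m <- mxModel("defDemo", type = "RAM",
  manifestVars = "y",
  latentVars   = "def_org1",
  # Dummy latent: variance fixed at zero ...
  mxPath(from = "def_org1", arrows = 2, free = FALSE, values = 0),
  # ... and a fixed mean path whose label pulls in each row's value of org1
  mxPath(from = "one", to = "def_org1", free = FALSE, values = 0,
         labels = "data.org1"),
  # Free regression of y on the definition variable
  mxPath(from = "def_org1", to = "y", labels = "b_org1"),
  # Residual variance and intercept of y
  mxPath(from = "y", arrows = 2, values = 1, labels = "resid_y"),
  mxPath(from = "one", to = "y", labels = "mean_y"),
  mxData(df, type = "raw")
)

# To estimate the model:
# fit <- mxRun(m); summary(fit)
```

Note that org1 sits in the data frame but is not listed in manifestVars; only the "data.org1" label connects it to the model.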

In #126 in OpenMx I suggested we add definition variables to the RAM spec, but after a lot of discussion, it wasn't supported, I think? But perhaps this revives that notion, or goes in a new direction?

mcneale commented 5 years ago

Definition variables can be part of RAM spec, but exclusively via path labeling - consistent with the way it works in matrix form. So I think in the end there wasn't anything extra needed.

Now, arrows=0 is something that IMO would be useful to add; I need to draft a proposal for that.

mcneale commented 5 years ago

I think it should work via path labeling. But perhaps we need a declaration like "Definition Age" to go with "Latents x y z" and "Observed p q r".

tbates commented 5 years ago

Adding definitionVars (in addition to latentVars and manifestVars) was the thought, but over to OpenMx. It will take some thinking to allow a variable to both drive a defVar and appear in manifests when so required.

I'll notify users about rows dropped due to listwise deletion (already do this for GxE and it's very useful). Will allow arbitrary name for def vars, defaulting to def_varName
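
A minimal base R sketch of that kind of notification follows. The helper name drop_missing_defvars and the toy data are hypothetical, and this is not umx's actual code.

```r
# Hypothetical helper: drop rows with a missing definition variable and
# tell the user how many rows were lost.
drop_missing_defvars <- function(data, def_vars) {
  keep <- complete.cases(data[, def_vars, drop = FALSE])
  n_dropped <- sum(!keep)
  if (n_dropped > 0) {
    message(n_dropped, " of ", nrow(data),
            " rows dropped: missing value in definition variable(s) ",
            paste(def_vars, collapse = ", "))
  }
  data[keep, , drop = FALSE]
}

dat <- data.frame(age = c(30, NA, 45, 50), y = c(1.2, 0.7, NA, 2.1))
clean <- drop_missing_defvars(dat, "age")
```

Only the definition variables are checked: the row with a missing y is kept, since missing values in modeled variables do not force deletion the way missing definition variables do.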

cheerios, t

mcneale commented 5 years ago

Makes sense. I think the straightforward way to have a definition variable operate as observed and as definition is to just duplicate it in the dataset:

mydataframe$ageDummy <- mydataframe$age

which is crude and wasteful of memory, but ought to work fine.

mcneale commented 5 years ago

According to the team, definition variables work as long as (i) they are in the dataset supplied (they don't have to be in the manifest list) and (ii) they are specified as definition variables, e.g., data.age. Plus, they can be re-used as manifest variables if so listed. So I think this is fully functional, minus the dropping of rows.

tbates commented 5 years ago

Agreed... this whole github issue was merely about how umxPath should implement generating a definition variable and whether to autoname them. So we're all on the same page, I believe, and umxPath(defn=) is working well now, including with variables used as a manifest and also as a data.var

I started a blog post on this... not finished yet.

http://tbates.github.io/advanced/1995/03/10/detailed-Definition-variables-in-RAM.html