tbates / umx

Making Structural Equation Modeling (SEM) in R quick & powerful
https://tbates.github.io/

umxGxE improve error if definition variable (moderator) is a factor #196

Closed: salvatoj closed this issue 2 years ago

salvatoj commented 2 years ago

I would like to use umxGxE with a binary definition variable (moderator) and a continuous outcome. However, when I apply mxFactor to my definition variable, the model runs but throws an error ("Error incurred trying to run umxSummary non-numeric argument to binary operator"). I then checked whether I got the same error with the definition variable coded as numeric (0/1). This produced no error, but I'm concerned that the variable type is not being treated correctly in the model. Is there any advice on how to proceed?

Reproducible example with toy data below:

# coding obese with mxFactor (normal vs. obese)
data(twinData)
# Cut bmi1 at its 20th percentile to form two categories ('normal' at or below the cut, 'obese' above it) and make into mxFactors (ensure ordered is TRUE, and require levels)
cutPoints <- quantile(twinData[, "bmi1"], probs = .2, na.rm = TRUE)
obesityLevels = c('normal', 'obese')
twinData$obese1 <- cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels)
twinData$obese2 <- twinData$obese1
ordDVs = c("obese1", "obese2")
twinData[, ordDVs] <- mxFactor(twinData[, ordDVs], levels = obesityLevels)
selDVs = "wt"
selDefs = "obese"
mzData <- subset(twinData, zygosity == "MZFF")
dzData <- subset(twinData, zygosity == "DZFF")
m1 = umxGxE(selDVs= "wt", selDefs= "obese", sep= "", dzData= dzData, mzData= mzData, tryHard= "yes", dropMissingDef = TRUE)

# coding obese as 0/1 numeric rather than as mxFactor
twinData$obese_num1 <- ifelse(twinData$obese1 == "normal", 0, 1)
twinData$obese_num2 <- twinData$obese_num1
selDVs = "wt"
selDefs = "obese_num"
mzData <- subset(twinData, zygosity == "MZFF")
dzData <- subset(twinData, zygosity == "DZFF")
m1 = umxGxE(selDVs= "wt", selDefs= "obese_num", sep= "", dzData= dzData, mzData= mzData, tryHard= "yes", dropMissingDef = TRUE)

mcneale commented 2 years ago

Hi Jessica

It is not a good idea to make an ordered factor out of a definition variable. Essentially, whatever value it has will be used in the model to define or adjust a path coefficient. If it's a factor, the model won't know what to do with it. Evidently, this is an issue that could use a more informative error message...
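
As a rough illustration of why the value has to be numeric, here is a minimal base-R sketch (it does not call umx or OpenMx). The path values `a` and `beta_a` are made-up numbers, and the toy `obese` vector only stands in for an ordered factor like the ones mxFactor produces; the point is simply that the moderator enters the path as a number in an expression of the form a + beta * moderator.

```r
# Toy moderator stored as an ordered factor (stand-in for an mxFactor column)
obese <- factor(c("normal", "obese", "obese", "normal"),
                levels = c("normal", "obese"), ordered = TRUE)

# Recode to numeric 0/1: "normal" becomes 0, "obese" becomes 1
obese_num <- as.numeric(obese == "obese")

# Hypothetical path values, for illustration only (not estimates from any model)
a      <- 0.6   # unmoderated part of the path
beta_a <- 0.2   # change in the path per unit of the moderator

# The moderator value is used arithmetically to adjust the path
moderated_path <- a + beta_a * obese_num
print(data.frame(obese, obese_num, moderated_path))
```

With 0/1 coding, `beta_a` reads as the difference in the path between the two groups, which is why the numeric coding is a reasonable way to handle a binary moderator.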

salvatoj commented 2 years ago

Thank you Mike! Very helpful to know. I will proceed with the binary 0/1 coding.

tbates commented 2 years ago

The error is from umxPlotGxE(m1). I guess the right thing to do is to stop umxGxE from accepting factor input, so that this is caught earlier on. I've done that now.

> mzData = subset(twinData, zygosity == "MZFF")
> dzData = subset(twinData, zygosity == "DZFF")
> m1 = umxGxE(selDVs= "wt", selDefs= "obese", sep= "", dzData= dzData, mzData= mzData, tryHard= "yes", dropMissingDef = TRUE)

Error: Definition vars (selDefs) must be numeric (not, e.g. factor)
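
For anyone wanting a similar guard in their own scripts, an early check along these lines might look like the sketch below. This is not the code inside umx: `check_defs_numeric` is a hypothetical helper written in base R for illustration, and the `toy` data frame is made up. Only the wording of the error message is taken from the output above.

```r
# Hypothetical helper (NOT part of umx): fail early if any definition
# variable column is not numeric (e.g. a factor)
check_defs_numeric <- function(data, defVars) {
  ok <- vapply(data[, defVars, drop = FALSE], is.numeric, logical(1))
  if (!all(ok)) {
    stop("Definition vars (selDefs) must be numeric (not, e.g. factor): ",
         paste(defVars[!ok], collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Made-up toy data: one factor-coded and one numeric-coded moderator
toy <- data.frame(
  obese1     = factor(c("normal", "obese")),
  obese_num1 = c(0, 1)
)

check_defs_numeric(toy, "obese_num1")   # numeric column: passes silently

# Factor column: the check stops; capture the message instead of aborting
result <- tryCatch(check_defs_numeric(toy, "obese1"),
                   error = function(e) conditionMessage(e))
print(result)
```

Running a check like this on the expanded column names (e.g. "obese1" and "obese2") before fitting surfaces the problem at model-building time rather than later in the summary or plot step.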