tbates / umx

Making Structural Equation Modeling (SEM) in R quick & powerful
https://tbates.github.io/
45 stars 17 forks source link

family design #130

Open ldasey opened 4 years ago

ldasey commented 4 years ago

Hey,

umx is great, thank you very much. I have a question: Will you extend umx to support "basic" family models? That is, MZ and DZ twins, possibly their parents and maybe one non-twin sibling. My aim is to: differ C further, estimate C and D at the same time and get some grip on assortative mating (instead of just guessing dzAr).

Thank you very much.

tbates commented 4 years ago

Hi @bububox. I'd have some interest in this direction. You're welcome also to generate code in that direction for inclusion here.

mcneale commented 4 years ago

The modeling of assortative mating - particularly phenotypic - is made much easier with the advent of mxPearsonSelCov() function. Essentially, use RAM to do all the other parts of the model, including possible P->E transmission from parent to child (your differentiation of C), then use mxPearsonSelCov to change the covariances of the parents' phenotypes, and to calculate the covariances among all the non-selected latent and observed variables. I have an example here. Sorry that it is unnecessarily long-winded, but the script was built by Onyx, and it seems to work. All the NAs seem like it's gone bananas though :).

# This model specification was automatically generated by Onyx

require("OpenMx");
modelData <- diag(4)
mzData <- diag(4)
dzData <- diag(4)
dummyMeanVector <- rep(0,4); 

manifests<-c("MomMZ","DadMZ","MZ1","MZ2")
latents<-c("AMZM","EMZM","AMZD","EMZD","AMZ1","EMZ1","AMZ2","EMZ2","CDZ")
allVarNames <- c(manifests,latents)

dimnames(modelData) <- list(manifests,manifests)
dimnames(mzData) <- list(manifests,manifests)
dimnames(dzData) <- list(manifests,manifests)
names(dummyMeanVector) <- manifests

nManifests <- length(manifests)
nLatents <- length(latents)
nTotalVars <- nManifests + nLatents

beforeAssort <- mxModel("beforeAssort", 
type="RAM",
manifestVars = c("MomMZ","DadMZ","MZ1","MZ2"),
latentVars   = c("AMZM","EMZM","AMZD","EMZD","AMZ1","EMZ1","AMZ2","EMZ2","CDZ"),
mxMatrix("Full", nrow=13,ncol=13,values=c(
0.0,0.0,0.0,0.0,0.4414220691543569,1.0286622878648066,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.4414220691543569,1.0286622878648066,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4414220691543569,1.0286622878648066,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4414220691543569,1.0286622878648066,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.5,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,
0.2908209056475054,0.2908209056475054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,
0.0,0.0,0.0,0.0,0.5,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,
0.2908209056475054,0.2908209056475054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
), free=c(
F,F,F,F,T,T,F,F,F,F,F,F,F,
F,F,F,F,F,F,T,T,F,F,F,F,F,
F,F,F,F,F,F,F,F,T,T,F,F,F,
F,F,F,F,F,F,F,F,F,F,T,T,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
T,T,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
T,T,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F
),labels=c(
NA,NA,NA,NA,"a","e",NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,"a","e",NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,"a","e",NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,"a","e",NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
"z","z",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
"z","z",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
), dimnames=list(allVarNames,allVarNames), byrow=TRUE, name="A"),
mxMatrix("Full", nrow=13,ncol=13,values=c(
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,1.0,-0.27,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,-0.27,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,1.0,-0.27,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,-0.27,1.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,.1,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.8,0.0,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,.1,0.0,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.8,0.0,
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4141608825809656
), free=c(
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,T,F,F,F,F,F,F,F,
F,F,F,F,T,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,T,F,F,F,F,F,
F,F,F,F,F,F,T,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,T,F,F,F,F,
F,F,F,F,F,F,F,F,F,T,F,F,F,
F,F,F,F,F,F,F,F,F,F,T,F,F,
F,F,F,F,F,F,F,F,F,F,F,T,F,
F,F,F,F,F,F,F,F,F,F,F,F,T
),labels=c(
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,"s",NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,"s",NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,"s",NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,"s",NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,"VAR_AMZ",NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,"VAR_EMZ",NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,"VAR_AMZ",NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,"VAR_EMZ",NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,"VC"
), dimnames=list(allVarNames,allVarNames), byrow=TRUE, name="S"),
mxMatrix("Full", nrow=4, ncol=13, values=c(
1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
), free=c(
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F,
F,F,F,F,F,F,F,F,F,F,F,F,F
),labels=c(
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
), byrow=TRUE, dimnames=list(manifests,allVarNames), name="F"),
mxMatrix("Full", nrow=1, ncol=13, dimnames=list("Mean",allVarNames), values=c(
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
), free=c(
F,F,F,F,F,F,F,F,F,F,F,F,F
),labels=c(
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
), byrow=TRUE, name="M"),
mxData(modelData, type = "cov", numObs=1, means=dummyMeanVector),
mxExpectationRAM("A","S","F","M"), mxFitFunctionML()
)

# Bound a and e parameters
beforeAssort$S$lbound[13,13] <-0
beforeAssort$S$lbound[12,12] <-0
beforeAssort$S$lbound[10,10] <-0
beforeAssort$S$lbound[11,11] <-0
beforeAssort$S$lbound[ 9, 9] <-0
beforeAssort$S$lbound["AMZ1","AMZ1"] <-0
beforeAssort$S$lbound["AMZ2","AMZ2"] <-0

beforeAssort$A$lbound[1, 5] <-0
beforeAssort$A$lbound[1, 6] <-0
beforeAssort$A$lbound[2, 7] <-0
beforeAssort$A$lbound[2, 8] <-0
beforeAssort$A$lbound[3, 9] <-0
beforeAssort$A$lbound[3,10] <-0
beforeAssort$A$lbound[4,11] <-0
beforeAssort$A$lbound[4,12] <-0
# For MZs this element gets an a parameter too
beforeAssort$A$lbound[3,11] <-0

# Second group for MZs
MZGroup <- mxRename(beforeAssort, newname="MZGroup")
# Alter the model so that i) T1 genotype doesn't cause T1
MZGroup$A$values["MZ1","AMZ1"] <- 0
MZGroup$A$free["MZ1","AMZ1"] <- F
MZGroup$A$labels["MZ1","AMZ1"] <- "not_a"
# And that ii) T2's genotype does instead, giving rG=1 for T1-T2
MZGroup$A$free["MZ1","AMZ2"] <- T
MZGroup$A$labels["MZ1","AMZ2"] <- "a"
MZGroup$A$values["MZ1","AMZ2"] <- MZGroup$A$values["MZ2","AMZ2"]

# Now set up a few objects to help specify the assortative mating part
offDiag <- mxMatrix("Full",2,2,values=c(0,1,1,0),name="offDiag")
nullBlock <- mxMatrix("Zero", nLatents, nLatents, name="nullBlock")
nullOffBlock <- mxMatrix("Zero", nManifests, nLatents, name="nullOffBlock")
DMatrix <- mxMatrix("Full", nTotalVars, nTotalVars, dimnames=list(allVarNames,allVarNames), name="DMatrix")
DMatrix$free["MomMZ","DadMZ"] <- F
DMatrix$labels["MomMZ","DadMZ"] <- "mu"
DMatrix$free["DadMZ","MomMZ"] <- F
DMatrix$labels["DadMZ","MomMZ"] <- "mu"
#DAlgebra <- mxAlgebra(rbind(cbind(DMatrix,nullOffBlock),cbind(t(nullOffBlock),nullBlock)), name="DAlgebra")
DAlgebra <- mxAlgebra(DMatrix, name="DAlgebra")
Imat <- mxMatrix("Iden",nTotalVars,nTotalVars,name="I")
assortmentList <- list(offDiag, DMatrix, DAlgebra, Imat, nullBlock, nullOffBlock)

# Third group for assortment bit MZ post-assortment expected covariances and MZ data go here
algebraModelMZ <- 
 mxModel("AssortMZ", assortmentList, 
         ## Algebras to work out covariances after selection
         mxAlgebra((solve(I-MZGroup.A) %&% MZGroup.S),name="covObsLat", dimnames=list(allVarNames,allVarNames)), # Covariance of observed and latents
         mxAlgebra( mxPearsonSelCov(covObsLat,(covObsLat+DAlgebra)), dimnames=list(allVarNames,allVarNames), name="postAssortmentCovAll"), # Post Assortment both obs & lat
         mxAlgebra(MZGroup.F %&% postAssortmentCovAll, name="postAssortmentCov", dimnames=list(manifests, manifests)), # Filter out the latent variables from post assortment covs
         mxAlgebra(t(MZGroup.F %*% (solve(I-MZGroup.A) %*% t(MZGroup.M))), name="preAssortmentMeans"), # Means (no selection)
         mxExpectationNormal(covariance = "postAssortmentCov", means = "preAssortmentMeans", dimnames = manifests),
         ## Constraints: Residual Genetic, Residual Env, G-E Covariance to specify equilibrium
         mxConstraint(MZGroup.S["AMZM","EMZM"] == postAssortmentCovAll["AMZ1","EMZ1"], name="constrainrGE"),
         mxConstraint(postAssortmentCovAll["AMZ2","AMZ2"] == postAssortmentCovAll["AMZM","AMZM"], name="constrainVG"),
         mxConstraint(postAssortmentCovAll["EMZ1","EMZ1"] == postAssortmentCovAll["EMZM","EMZM"], name="constrainVE"),
         mxFitFunctionML(), 
         mxData(observed=mzData, type="cov", numObs=500, means=dummyMeanVector)
        )

# Fourth group for assortment bit DZ post-assortment expected covariances and MZ data go here
algebraModelDZ <- 
 mxModel("AssortDZ", assortmentList, 
         ## Algebras to work out covariances after selection
         mxAlgebra((solve(I-beforeAssort.A) %&% beforeAssort.S),name="covObsLat", dimnames=list(allVarNames,allVarNames)), # Covariance of observed and latents
         mxAlgebra( mxPearsonSelCov(covObsLat,(covObsLat+DAlgebra)), dimnames=list(allVarNames,allVarNames), name="postAssortmentCovAll"), # Post Assortment both obs & lat
         mxAlgebra(beforeAssort.F %&% postAssortmentCovAll, name="postAssortmentCov", dimnames=list(manifests,manifests)), # Filter out the latent variables
         mxAlgebra(t(beforeAssort.F %*% (solve(I-beforeAssort.A) %*% t(beforeAssort.M))), name="preAssortmentMeans"), # Means (no selection)
         mxExpectationNormal(covariance = "postAssortmentCov", means = "preAssortmentMeans", dimnames = manifests),
         mxFitFunctionML(), 
         mxData(observed=dzData, type="cov", numObs=500, means=dummyMeanVector)
        )

assModel <- mxModel("top", MZGroup, beforeAssort, algebraModelMZ, algebraModelDZ, mxFitFunctionMultigroup(c("AssortMZ","AssortDZ")))
summary(assModelFit <- mxRun(assModel))

#### Try with Rose Fear Data, Factor 8
mzFearData <- matrix(c(
.903, .113, .2547, .1602, 
.113, 1.1053, .0864, .2052, 
.2547, .0864, .9467, .4938, 
.1602, .2052, .4938, .8947), 4, 4, dimnames=list(manifests,manifests))
dzFearData <- matrix(c(
1.0811, .2111, .1443, .1001,
.2111, .9063, .2398, .1551,
.1443, .2398, 1.2079, .3246,
.1001, .1551, .3246, 1.0318
), 4, 4, dimnames=list(manifests,manifests))

assFear <- assModel
assFear$AssortMZ$data <- mxData(observed=mzFearData, type="cov", numObs=144, means=dummyMeanVector)
assFear$AssortDZ$data <- mxData(observed=dzFearData, type="cov", numObs=106, means=dummyMeanVector)

summary(assFearRun <- mxRun(assFear))

# Add in assortment
assFearMu <- assFearRun
assFearMu$AssortMZ$DMatrix$free["MomMZ","DadMZ"] <- T
assFearMu$AssortMZ$DMatrix$labels["MomMZ","DadMZ"] <- "mu"
assFearMu$AssortMZ$DMatrix$free["DadMZ","MomMZ"] <- T
assFearMu$AssortMZ$DMatrix$labels["DadMZ","MomMZ"] <- "mu"
assFearMu$AssortDZ$DMatrix$free["MomMZ","DadMZ"] <- T
assFearMu$AssortDZ$DMatrix$labels["MomMZ","DadMZ"] <- "mu"
assFearMu$AssortDZ$DMatrix$free["DadMZ","MomMZ"] <- T
assFearMu$AssortDZ$DMatrix$labels["DadMZ","MomMZ"] <- "mu"

summary(assFearMuRun <- mxRun(assFearMu))
semPaths(assFearMuRun$MZGroup, layout='spring', what='est', intercepts=F)
tbates commented 3 years ago

umxACEv now supports nSib = 3, with an extra non-twin sib. Expects sib variables to have same naming as twin 1 and 2, but with a "3" suffix. Covariates also supported.