Closed harryparr closed 2 years ago
Dear Harry,
thank you for your comment and nice feedback.
By default, JointAI works under the assumption of MAR, i.e., that the missing values come from the same distribution as the observed values, conditional on the other data. The case where the missing observations can only take a subset of the categories in the observed data would likely be MNAR, which requires some additional specifications.
In the case you describe, you could try a pattern mixture model approach by editing the JAGS model JointAI created. This is technically possible with JointAI, but you will introduce additional assumptions by fixing the parameter values.
The easiest way would probably be to include a missingness indicator for the incomplete ordinal variable as an auxiliary variable (argument auxvars
) and to specify that this indicator has a non-proportional effect on the ordinal covariate (argument nonprop
).
Here is a simplified example (you need to work with the GitHub version of JointAI - I just discovered and fixed a bug in the nonprop
argument):
library("JointAI")
# create some extra missing values for this example
wideDF$O2[sample(1:nrow(wideDF), size = 30)] <- NA
# create the missing data indicator
wideDF$O2NA <- is.na(wideDF$O2)
# set-up run without iterations to get the JAGS model
mod0 <- lm_imp(y ~ C1 + B1 + O2, data = wideDF,
auxvars = ~O2NA, nonprop = list(O2 = ~ O2NA),
modelname = "mymodel.R", modeldir = getwd(),
keep_model = TRUE,
n.adapt = 0)
The JAGS model has then been saved as "mymodel.R" which you can edit. Rows 29 - 31 in this example are
eta_O2_1[i] <- M_lvlone[i, 9] * alpha[3]
eta_O2_2[i] <- M_lvlone[i, 9] * alpha[4]
eta_O2_3[i] <- M_lvlone[i, 9] * alpha[5]
This can be changed to
eta_O2_1[i] <- M_lvlone[i, 9] * 10
eta_O2_2[i] <- M_lvlone[i, 9] * 0
eta_O2_3[i] <- M_lvlone[i, 9] * (-10)
The values 10 and -10 are arbitrary and just chosen to be extreme enough to make categories 1 and 4 sufficiently unlikely. When you then save the file as "mymodel_edited.R" you can then run the model as
mod <- lm_imp(y ~ C1 + B1 + O2, data = wideDF,
auxvars = ~O2NA, nonprop = list(O2 = ~ O2NA),
monitor_params = c(imps = TRUE, other_models = TRUE),
modelname = "mymodel_edited.R", modeldir = getwd(),
overwrite = FALSE,
n.iter = 500)
All imputed values are now in categories 2 and 3:
imp <- get_MIdat(mod, m = 30, include = FALSE)
table(category = imp$O2, missing = imp$O2NA)
missing
category FALSE TRUE
1 480 0
2 360 394
3 630 566
4 570 0
I hope this helps. Keep in mind that this is just a technical solution. I haven't thought through the theoretical implications of this.
Best, Nicole
Dear Dr. Erler,
Many thanks for this terrific package.
I wonder if it's possible to impute specific values? For example, a
coxph_imp
class that has some missing independent covariate ordered categoriesclmm
with 4 ordered levels but I'd only want the second & third levels to be imputed, not to impute the reference or the fourth level. Can this be done in the get_MIdat or is it within the _imp function?Many thanks, Harry