Missing values

krisztianposch commented 8 years ago

Dear medflex Team,

Maybe it is a misunderstanding on my side, but the package does not seem to work with missing data. Using the example dataset:

UPBData_alt <- UPBdata
UPBData_alt[200, "negaff"] <- NA
impData <- neImpute(UPB ~ att + initiator * negaff + gender + educ + age,
                    family = binomial("logit"), nMed=2, data = UPBData_alt)

This produces the following warning:

Warning in cbind(aux0, aux1) :
  number of rows of result is not a multiple of vector length (arg 2)

Could you please help with this and clarify why it doesn't work?

Thank you for your diligent work.

jmpsteen commented 8 years ago

Dear Krisztian,

thanks for your interest in the package! The imputation-based estimator implemented in neImpute only accommodates missingness in the outcome. Missingness in the other variables can be accommodated by multiple imputation. Please consult section 9.2 of the companion vignette of the package: https://cran.r-project.org/web/packages/medflex/vignettes/medflex.pdf
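
Before calling neImpute, it can help to check which variables contain missing values. The sketch below uses base R only and a small made-up data frame: `toy`, its values, and the helper names `naCounts` and `needsMI` are hypothetical, and only the column names are borrowed from the example above.

```r
# Hypothetical toy data; column names mimic the UPBdata example
toy <- data.frame(
  UPB    = c(1, 0, NA, 1, 0),             # outcome
  att    = c(0.2, -1.1, 0.5, 1.3, -0.4),  # exposure
  negaff = c(0.1, NA, -0.3, 0.8, 0.0)     # mediator
)

# Number of missing values per variable
naCounts <- colSums(is.na(toy))

# Variables other than the outcome that contain missing values
outcome <- "UPB"
needsMI <- setdiff(names(naCounts)[naCounts > 0], outcome)

print(naCounts)
print(needsMI)
```

If `needsMI` is non-empty, the missingness is not limited to the outcome, and multiple imputation is the route suggested here.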

I was wondering which version of medflex you're using since I couldn't replicate the error. Running the code below

UPBData_alt <- UPBdata
UPBData_alt[200, "negaff"] <- NA
impData <- neImpute(UPB ~ att + initiator * negaff + gender + educ + age,
                    family = binomial("logit"), nMed=2, data = UPBData_alt)
neMod <- neModel(UPB ~ att0 * att1 + gender + educ + age, 
                 family = binomial("logit"), expData = impData, se = "r")

returned the error

Error in aggregate.data.frame(as.data.frame(x), ...) : 
  arguments must have same length

which is expected: neModel cannot calculate robust standard errors when variables other than the outcome contain missing values (requesting bootstrap standard errors instead avoids this).
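
The kind of length mismatch behind messages like these can be reproduced in base R. This is only an illustration of general R behaviour with made-up data (`toy`, `fit` and the other names are hypothetical), not a trace of what medflex does internally: by default, glm drops rows with a missing predictor, so the fitted values end up shorter than the data they are later combined with.

```r
# Hypothetical toy data with one missing predictor value
toy <- data.frame(
  y = c(0, 1, 0, 1, 1, 0, 1, 0, 0, 1),
  x = c(-1.2, 0.4, 0.3, 1.1, -0.5, 0.9, 0.2, -0.7, 1.5, -0.1)
)
toy[4, "x"] <- NA

# The default na.action (na.omit) silently drops the incomplete row
fit <- glm(y ~ x, family = binomial("logit"), data = toy)
nRows <- nrow(toy)
nFitted <- length(fitted(fit))  # one fewer than nRows

# Combining vectors of unequal length triggers a cbind warning;
# capture its message instead of printing it
warnMsg <- NULL
combined <- withCallingHandlers(
  cbind(seq_len(nRows), fitted(fit)),
  warning = function(w) {
    warnMsg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(c(rows = nRows, fitted = nFitted))
print(warnMsg)
```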

A multiple imputation analysis can be obtained by running the following code (assuming, of course, that this corresponds to the final natural effect model that you wish to fit). More details can be found in section 9.2 of the vignette.

library("mice")
library("mitools")

multImp <- mice(UPBData_alt, m = 10)
expData <- with(multImp, neImpute(UPB ~ att + initiator * negaff + gender + educ + age,
                                  family = binomial("logit"), nMed = 2))
expData <- imputationList(expData$analyses)
neMod1 <- with(expData, neModel(UPB ~ att0 * att1 + gender + educ + age,
                                family = binomial("logit"), se = "r"))

MIcombine(neMod1)
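
MIcombine pools the per-imputation fits using Rubin's rules. As a rough sketch of that pooling for a single coefficient, in base R (the numbers are invented for illustration and are not medflex output):

```r
# Hypothetical estimates and squared standard errors of one coefficient
# from m = 5 imputed data sets
est <- c(0.42, 0.38, 0.45, 0.40, 0.35)
se2 <- c(0.010, 0.012, 0.011, 0.009, 0.013)
m <- length(est)

pooledEst  <- mean(est)  # pooled point estimate
withinVar  <- mean(se2)  # average within-imputation variance
betweenVar <- var(est)   # between-imputation variance

# Total variance adds the between-imputation part, inflated by (1 + 1/m)
totalVar <- withinVar + (1 + 1 / m) * betweenVar
pooledSE <- sqrt(totalVar)

print(c(estimate = pooledEst, se = pooledSE))
```

The pooled standard error is larger than the average per-imputation one whenever the estimates differ across imputed data sets, which reflects the extra uncertainty due to the missing values.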

Hope this helps!

Johan

krisztianposch commented 8 years ago

Dear Johan,

Thank you for your quick response and kind clarification. I had read the paper and understood that only the outcome variable will be imputed, but couldn't decode the error message I received. It's all clear now.

Thanks again!

jmpsteen / medflex

Missing values #5