FarrellDay / miceRanger

miceRanger: Fast Imputation with Random Forests in R

How to work with imputed data for subsequent modeling? #8

Closed bbachilles closed 3 years ago

bbachilles commented 3 years ago

Hello,

Thanks so much to you and your team for your work on miceRanger. I used PMM to impute what I believe is the default of 5 imputed datasets, and I would like to use those 5 datasets in a subsequent analysis. Typically, I would use something like the mitml package for this by extracting the list of datasets:

library(mitml)
library(lme4)  # provides lmer()

# Stacked dataset with a variable, imputation, that identifies the imputed dataset number
implist <- as.mitml.list(split(JSP_imp2, JSP_imp2$imputation))

# Specify the syntax for an lme4 (lmer) model
m_imp <- "MathTest2 ~ MathTest0 + male + manual + smn_math0 + smn_male + smn_manual + (manual || School)"

# Run the analysis on each imputed dataset using lmer
analysis <- with(implist, lmer(m_imp, REML = FALSE))

# Get the pooled estimates
estimates <- testEstimates(analysis, var.comp = TRUE, df.com = NULL)

Does the same procedure apply to a miceRanger object? I cannot seem to get it to work. Any help or information about this process would be most appreciated.

Thanks!

samFarrellDay commented 3 years ago

There are no built-in functions to pool results. I personally build the process manually, using lapply to run whatever analysis I am performing on each imputed dataset and return the results as a list.
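
For readers who want to see what the manual route might look like, here is a minimal base-R sketch. It is not code from the maintainer. It assumes the imputed datasets are available as a list (in miceRanger, completeData() returns such a list), uses toy data as a stand-in, and pools a simple lm() fit with Rubin's rules. The names `completed`, `fits`, and `pooled` are hypothetical.

```r
# Toy stand-in for a list of imputed datasets (illustration only).
# With miceRanger this list would come from completeData(miceObj).
set.seed(1)
completed <- lapply(1:5, function(i) {
  x <- rnorm(100)
  data.frame(x = x, y = 2 * x + rnorm(100))
})

# Step 1: run the same analysis on every dataset.
fits <- lapply(completed, function(d) lm(y ~ x, data = d))

# Step 2: collect the estimates and their sampling variances.
est <- sapply(fits, coef)                       # coefficients x m
se2 <- sapply(fits, function(f) diag(vcov(f)))  # variances x m

# Step 3: combine with Rubin's rules.
m     <- length(fits)
q_bar <- rowMeans(est)              # pooled point estimate
u_bar <- rowMeans(se2)              # within-imputation variance
b     <- apply(est, 1, var)         # between-imputation variance
t_var <- u_bar + (1 + 1 / m) * b    # total variance

pooled <- data.frame(estimate = q_bar, std.error = sqrt(t_var))
print(pooled)
```

The same three steps should carry over to other model types, provided each fit exposes its coefficients and their variances.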

bbachilles commented 3 years ago

If I could put in a plug for "piping" results into the mice/mitml universe for analysis, that would be great. Thanks!
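
In the meantime, one possible bridge into mitml is to convert the output of completeData() by hand. The sketch below is an assumption-laden illustration rather than an official workflow. It uses iris with artificially removed values (via miceRanger's amputeData()) in place of the real data, and a plain lm() in place of the lmer model above.

```r
library(miceRanger)
library(mitml)

# Illustration data: iris with 25% of values set to NA.
set.seed(1)
ampIris <- amputeData(iris, perc = 0.25)

# Impute 5 datasets; "meanMatch" is miceRanger's predictive mean matching.
miceObj <- miceRanger(ampIris, m = 5, valueSelector = "meanMatch",
                      verbose = FALSE)

# completeData() returns one completed dataset per imputation.
completed <- completeData(miceObj)

# mitml expects a list of data frames, so convert each element.
implist <- as.mitml.list(lapply(completed, as.data.frame))

# From here the usual mitml workflow applies.
analysis  <- with(implist, lm(Sepal.Length ~ Petal.Length))
estimates <- testEstimates(analysis)
print(estimates)
```

Because testEstimates() accepts a list of fitted models, the list returned by a manual lapply() over `completed` could be passed to it in the same way.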