I haven't seen anyone store multiple imputations as a nested data frame (one row per imputation, with the completed data in a list-column). That seems like a really natural way to store the data, rather than as some sort of messy list structure. I still need to work out how to link the shadow matrix to this.
library(mice)
library(tidyverse)
#> Loading tidyverse: tibble
#> Loading tidyverse: tidyr
#> Loading tidyverse: readr
#> Loading tidyverse: purrr
#> Loading tidyverse: dplyr
#> Conflicts with tidy packages ----------------------------------------------
#> complete(): tidyr, mice
#> filter(): dplyr, stats
#> is_null(): purrr, testthat
#> lag(): dplyr, stats
#> matches(): dplyr, testthat
imp <- mice(nhanes)
#>
#> iter imp variable
#> 1 1 bmi hyp chl
#> 1 2 bmi hyp chl
#> 1 3 bmi hyp chl
#> 1 4 bmi hyp chl
#> 1 5 bmi hyp chl
#> 2 1 bmi hyp chl
#> 2 2 bmi hyp chl
#> 2 3 bmi hyp chl
#> 2 4 bmi hyp chl
#> 2 5 bmi hyp chl
#> 3 1 bmi hyp chl
#> 3 2 bmi hyp chl
#> 3 3 bmi hyp chl
#> 3 4 bmi hyp chl
#> 3 5 bmi hyp chl
#> 4 1 bmi hyp chl
#> 4 2 bmi hyp chl
#> 4 3 bmi hyp chl
#> 4 4 bmi hyp chl
#> 4 5 bmi hyp chl
#> 5 1 bmi hyp chl
#> 5 2 bmi hyp chl
#> 5 3 bmi hyp chl
#> 5 4 bmi hyp chl
#> 5 5 bmi hyp chl
# variables to keep from each completed dataset
vars <- c("age", "bmi")
# get the number of imputations used and store it
m_imp <- imp$m
# make a list to contain all of the imputed data frames (m of them)
dat.mi.list <- vector("list", m_imp)
# now, go through 1...m and do the following
for (i in seq_len(m_imp)) {
  # set the i-th element to be
  dat.mi.list[[i]] <-
    # the i-th completed dataset from multiple imputation
    mice::complete(imp, i) %>%
    # then subset the data to the variables specified
    dplyr::select(dplyr::one_of(vars)) %>%
    # then make a column called `m`, and make this a factor
    dplyr::mutate(m = as.factor(i))
}
# length(dat.mi.list)
data.imputed.melt <- do.call("rbind", dat.mi.list)
bound_imputed <- dat.mi.list %>%
  dplyr::bind_rows() %>%
  tibble::as_tibble() %>%
  dplyr::group_by(m) %>%
  tidyr::nest()
#> Warning in bind_rows_(x, .id): Unequal factor levels: coercing to character
bound_imputed
#> # A tibble: 5 × 2
#> m data
#> <chr> <list>
#> 1 1 <tibble [25 × 2]>
#> 2 2 <tibble [25 × 2]>
#> 3 3 <tibble [25 × 2]>
#> 4 4 <tibble [25 × 2]>
#> 5 5 <tibble [25 × 2]>
Aside from removing the loop around mice::complete, there needs to be some way to link these data back to nhanes, so that we can identify which values were imputed.
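A rough sketch of one way this might look, not a settled design: build the nested data frame with purrr::map instead of a loop, and bind a shadow matrix (is.na(nhanes)) onto each completed dataset so every row carries flags for which values were originally missing. The `_NA` suffix, the names `shadow` and `nested_imp`, and the fixed seed are assumptions made for illustration only.

```r
library(mice)
library(dplyr)
library(purrr)

# printFlag = FALSE silences the iteration log; the seed is arbitrary
imp <- mice(nhanes, printFlag = FALSE, seed = 123)

# shadow matrix: TRUE wherever the original nhanes value was missing
# (the "_NA" suffix is an assumed naming convention)
shadow <- as.data.frame(is.na(nhanes))
names(shadow) <- paste0(names(shadow), "_NA")

# one row per imputation; each list-column entry is the i-th completed
# dataset with the shadow columns bound alongside it
nested_imp <- tibble::tibble(m = seq_len(imp$m)) %>%
  dplyr::mutate(
    data = purrr::map(m, function(i) {
      tibble::as_tibble(dplyr::bind_cols(mice::complete(imp, i), shadow))
    })
  )
```

With this layout, the imputed bmi values in the first completed dataset could be pulled out with `dplyr::filter(nested_imp$data[[1]], bmi_NA)`. The shadow columns are repeated in every nested tibble, which is wasteful but keeps each one self-describing.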
The loop-based approach above is taken from neato::imputation_plot.