missCompare is a missing data imputation pipeline that guides you through your missing data problem. A range of functions helps you select the most suitable imputation algorithm for your data and provides an easy way to impute the missing data points in your dataset.
You will find a detailed manual in the missCompare vignette:
install.packages("missCompare")
library(missCompare)
vignette("misscompare")
missCompare::clean()
missCompare::get_data()
missCompare::simulate()
missCompare::all_patterns(). These patterns are:
- missCompare::MCAR() - missing data occurrence random
- missCompare::MAR() - missing data occurrence correlates with other variables' values (univariate solution in missCompare)
- missCompare::MNAR() - missing data occurrence correlates with variables' own values
- missCompare::MAP() - a combination of the previous three, where the user can define a pattern per variable
missCompare::impute_simulated()
missCompare::impute_data()
missCompare::post_imp_diag()
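The difference between the three basic patterns listed above (MCAR, MAR, MNAR) can be illustrated with a minimal base R sketch. It does not call or reproduce missCompare's own functions; the variable names, the 20% missingness level and the quantile cut-offs are assumptions chosen purely for illustration.

```r
# Illustrative sketch in base R; this is NOT how missCompare generates patterns.
set.seed(42)
n <- 1000
x <- rnorm(n)              # fully observed covariate (hypothetical)
y <- 0.6 * x + rnorm(n)    # variable that will receive missing values

# MCAR: every value of y has the same 20% chance of being missing
y_mcar <- y
y_mcar[runif(n) < 0.2] <- NA

# MAR: missingness in y depends on the value of another variable (x)
y_mar <- y
y_mar[x > quantile(x, 0.8)] <- NA

# MNAR: missingness in y depends on the value of y itself
y_mnar <- y
y_mnar[y > quantile(y, 0.8)] <- NA

# Compare the mean of the observed values under each pattern
sapply(list(full = y, MCAR = y_mcar, MAR = y_mar, MNAR = y_mnar),
       mean, na.rm = TRUE)
```

In this toy setup the observed mean under MCAR would be expected to stay close to the full-data mean, while the MAR and MNAR versions, as constructed here, remove the upper tail and pull the observed mean down.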
You can install the released version of missCompare from CRAN with:
install.packages("missCompare")
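library(missCompare)
data("clindata_miss")

# Clean the data: apply the variable and individual removal thresholds
# and treat values coded as -9 as missing
cleaned <- missCompare::clean(clindata_miss,
                              var_removal_threshold = 0.5,
                              ind_removal_threshold = 1,
                              missingness_coding = -9)

# Extract metadata (dimensions, correlations, missingness) from the cleaned data
metadata <- missCompare::get_data(cleaned,
                                  matrixplot_sort = TRUE,
                                  plot_transform = TRUE)

# Test the imputation algorithms on data simulated from this metadata
missCompare::impute_simulated(rownum = metadata$Rows,
                              colnum = metadata$Columns,
                              cormat = metadata$Corr_matrix,
                              MD_pattern = metadata$MD_Pattern,
                              NA_fraction = metadata$Fraction_missingness,
                              min_PDM = 10,
                              n.iter = 50,
                              assumed_pattern = NA)

# Impute the cleaned data with the selected methods (here all 16)
imputed <- missCompare::impute_data(cleaned,
                                    scale = TRUE,
                                    n.iter = 10,
                                    sel_method = c(1:16))

# Run post-imputation diagnostics on the first mean-imputed dataset
diag <- missCompare::post_imp_diag(cleaned,
                                   imputed$mean_imputation[[1]],
                                   scale = TRUE,
                                   n.boot = 100)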
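The usage example above passes the mean-imputed dataset (imputed$mean_imputation[[1]]) to the diagnostics step. As a rough, package-independent illustration of why it is worth comparing an imputed dataset against the original, the base R sketch below applies mean imputation to a single simulated variable. All names and numbers are hypothetical, and the code is not the missCompare implementation.

```r
# Illustrative sketch in base R; not the missCompare implementation.
set.seed(1)
original <- rnorm(500, mean = 50, sd = 10)   # hypothetical complete variable

# Remove 30% of the values completely at random
with_na <- original
with_na[sample(length(with_na), 150)] <- NA

# Mean imputation: replace every missing value with the observed mean
mean_imputed <- with_na
mean_imputed[is.na(mean_imputed)] <- mean(with_na, na.rm = TRUE)

# The observed mean is preserved, but the spread shrinks because all
# imputed values sit exactly at the center of the distribution
c(sd_observed = sd(with_na, na.rm = TRUE),
  sd_imputed  = sd(mean_imputed))
```

The shrinking standard deviation is one example of a change in data structure that an imputation can introduce and that a post-imputation check can surface.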
If you need help or advice on your missing data problem, or help with the missCompare package, please e-mail the authors. If you would like to report an issue, please do so with a reproducible example at the missCompare GitHub page.