Closed edwindj closed 3 years ago
Hey Mark,
Playing around with simputation, I thought the following functions would be handy:
library(simputation)
na_status
Data mutilation:
dat <- iris
dat[1:3,1] <- dat[3:7,2] <- dat[8:10,5] <- NA
na_status gives a quick overview of the number of NAs in a data.frame and the columns they occur in:
na_status(dat)
##
## na count: 11
##        columns nNA
## 1  Sepal.Width   5
## 2      Species   3
## 3 Sepal.Length   3
It is useful to check the progress of the imputation process.
dat2 <- impute_lm(dat, Sepal.Length ~ Sepal.Width + Species)
na_status(dat2)
##
## na count: 9
##        columns nNA
## 1  Sepal.Width   5
## 2      Species   3
## 3 Sepal.Length   1
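To make the proposal concrete, here is a minimal base-R sketch of what such a summary function could look like. The name `na_status_sketch` and its internals are my own assumptions for illustration, not simputation code; the order of tied columns may also differ from the output above.

```r
# Hypothetical sketch, not simputation's implementation:
# count NAs per column and report only the columns that have any.
na_status_sketch <- function(dat) {
  n_na <- colSums(is.na(dat))                       # NA count per column
  n_na <- sort(n_na[n_na > 0], decreasing = TRUE)   # keep affected columns only
  out <- data.frame(columns = names(n_na), nNA = unname(n_na),
                    stringsAsFactors = FALSE)
  if (nrow(out) == 0) {
    cat("\nNo NA's.\n")
  } else {
    cat("\nna count:", sum(out$nNA), "\n")
    print(out)
  }
  invisible(out)  # hand the table back for programmatic use
}

# Same "mutilated" iris data as in the example above
dat <- iris
dat[1:3, 1] <- dat[3:7, 2] <- dat[8:10, 5] <- NA
status <- na_status_sketch(dat)
```

Returning the summary table invisibly keeps the console output clean while still allowing checks such as `sum(status$nNA)` in a script.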
glimpse_na
When using an imputation pipeline, glimpse_na can be handy. It prints the na_status overview but returns its input unchanged, so it can be placed anywhere in a pipeline:
library(dplyr)
dat_imputed <- dat %>%
  glimpse_na()
library(magrittr)
dat_imputed <- dat %>%
  impute_lm(Sepal.Length ~ Sepal.Width + Species) %>%
  glimpse_na()
OK, still work to do on Sepal.Length:
dat_imputed <- dat %>%
  impute_lm(Sepal.Length ~ Sepal.Width + Species) %>%
  impute_median(Sepal.Length ~ Species) %>%
  glimpse_na()
##
## na count: 8
##       columns nNA
## 1 Sepal.Width   5
## 2     Species   3
And finish it off in the next iteration:
dat_imputed <- dat %>%
  impute_lm(Sepal.Length ~ Sepal.Width + Species) %>%
  impute_median(Sepal.Length ~ Species) %>%
  impute_cart(. ~ .) %>%
  glimpse_na()
##
## No NA's.
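The "print, then pass the data through" behaviour is the key point, so here is a rough base-R sketch of it. `glimpse_na_sketch` is a hypothetical name and the body is only an assumption about how this could be done, not the package's code.

```r
# Hypothetical sketch: print an NA summary as a side effect,
# then return the input untouched so a pipeline can continue.
glimpse_na_sketch <- function(dat) {
  n_na <- colSums(is.na(dat))
  n_na <- n_na[n_na > 0]
  if (length(n_na) == 0) {
    cat("\nNo NA's.\n")
  } else {
    cat("\nna count:", sum(n_na), "\n")
    print(data.frame(columns = names(n_na), nNA = unname(n_na)))
  }
  invisible(dat)  # the original data, not the summary
}

dat <- iris
dat[1:3, 1] <- NA
res <- glimpse_na_sketch(dat)  # prints the summary; res is the same data as dat
```

Because the data frame itself is the return value, the function can sit between any two steps of a `%>%` chain without changing the result.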
We can also peek into the imputation pipeline with %?>%, which effectively inserts a glimpse_na at that point:
dat_imputed <- dat %>%
  impute_lm(Sepal.Length ~ Sepal.Width + Species) %?>%
  impute_median(Sepal.Length ~ Species) %>%
  impute_cart(. ~ .) %>%
  glimpse_na()
##
## na count: 9
##        columns nNA
## 1  Sepal.Width   5
## 2      Species   3
## 3 Sepal.Length   1
##
## No NA's.
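For the peeking pipe, a simplified base-R sketch of the idea is below. I named it `%peek>%` (hypothetical) to keep it apart from the proposed `%?>%`. It only inserts the left-hand side as the first argument of the right-hand call and does not handle magrittr's `.` placeholder, so treat it as an illustration of the mechanism rather than a drop-in operator.

```r
# Hypothetical sketch of a "peeking" pipe: report NAs in the left-hand side,
# then call the right-hand side with it as the first argument.
`%peek>%` <- function(lhs, rhs) {
  cat("\nna count:", sum(is.na(lhs)), "\n")   # the peek (side effect only)
  rhs_call <- substitute(rhs)                 # capture e.g. head(n = 2) unevaluated
  new_call <- as.call(c(list(rhs_call[[1]], quote(lhs)),
                        as.list(rhs_call)[-1]))  # becomes head(lhs, n = 2)
  eval(new_call, envir = list(lhs = lhs), enclos = parent.frame())
}

dat <- iris
dat[1:3, 1] <- NA
res <- dat %peek>% head(n = 2)  # prints the NA count, then evaluates head(dat, n = 2)
```

Since all `%...%` operators share the same precedence and associate left to right, an operator like this can be mixed freely with `%>%` in one chain.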
awesombalzzz!