TidyPlusR: a tool for data wrangling

The tidyplusR package is a data cleaning package with features for missing value treatment, data manipulation, and displaying data as markdown tables for documents. It adds functionality on top of the existing data wrangling packages in R. The objective of the package is to provide a few specific functions that address common problems in data cleaning.


You can install tidyplusR from GitHub with:

# install.packages("devtools")

Functions included:

The functions in tidyplusR are grouped into three main parts: data type cleansing, missing value imputation, and markdown tables.


This is a basic example which shows you how to solve a common problem:

Data Type Cleansing

This section has two functions, typemix and cleanmix.


typemix(dat)
## [[1]]
##      x1   x2    x3
## 1     1 test  TRUE
## 2     2 test  TRUE
## 3     3    1 FALSE
## 4 1.2.3 TRUE FALSE
## [[2]]
##          x1        x2 x3
## 1    number character NA
## 2    number character NA
## 3    number    number NA
## 4 character   logical NA
## [[3]]
##   Column_ID number character logical
## 1         1      3         1       0
## 2         2      1         2       1
##     x1   x2    x3
## 1    1 test  TRUE
## 2    2 test  TRUE
## 3    3 <NA> FALSE
## 4 <NA> <NA> FALSE
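
The printed output suggests how the two functions relate: typemix labels each value in a mixed column as number, character, or logical, and the last data frame shows the values that do not match a chosen type replaced with NA. The base R sketch below imitates that behavior; it is not tidyplusR's implementation. The helper guess_type, the keep vector, and the reconstruction of dat from the first printed data frame are all assumptions made for illustration.

```r
# Hypothetical helper, not part of tidyplusR: guess the type of one cell.
guess_type <- function(value) {
  value <- as.character(value)
  if (value %in% c("TRUE", "FALSE")) return("logical")
  if (!is.na(suppressWarnings(as.numeric(value)))) return("number")
  "character"
}

# Input reconstructed from the first data frame printed above (an assumption).
dat <- data.frame(
  x1 = c(1, 2, 3, "1.2.3"),
  x2 = c("test", "test", 1, TRUE),
  x3 = c(TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Label every cell of the two mixed columns.
types <- sapply(dat[c("x1", "x2")], function(col) {
  vapply(col, guess_type, character(1), USE.NAMES = FALSE)
})

# Count how many values of each type every column holds.
counts <- sapply(c("number", "character", "logical"),
                 function(t) colSums(types == t))

# Keep one type per column and set everything else to NA.
keep <- c(x1 = "number", x2 = "character")
cleaned <- dat
for (col in names(keep)) {
  cleaned[[col]][types[, col] != keep[[col]]] <- NA
}

print(types)
print(counts)
print(cleaned)
```

In this sketch the choice of which type to keep is passed explicitly through keep; how cleanmix receives that choice is not shown in this document.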

Missing Value Imputation

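# Dummy dataframe
# The definitions of columns y, w, z and a are missing from this listing;
# the lines marked "assumed" are placeholders inferred from the output below.
dat <- data.frame(x = sample(letters[1:3], 20, TRUE),
                  y = sample(letters[1:3], 20, TRUE),      # assumed
                  w = as.numeric(sample(0:50, 20, TRUE)),  # assumed
                  z = sample(letters[1:3], 20, TRUE),      # assumed
                  b = as.logical(sample(0:1, 20, TRUE)),
                  a = as.numeric(sample(0:50, 20, TRUE)),  # assumed
                  stringsAsFactors = FALSE)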

dat[c(5,10,15),1] <- NA
dat[c(3,7),2] <- NA
dat[c(1,3,5),3] <- NA
dat[c(4,5,9),4] <- NA
dat[c(4,5,9),5] <- NA
dat[,4] <- factor(dat[,4] )
dat[c(4,5,9),6] <- NA

# Calling impute function
# method can also be set to "median" or "mean"

impute(dat,method = "mode") %>% head()
##   x y    w z     b     a
## 1 a b 34.6 a  TRUE 40.00
## 2 b c 33.0 c FALSE  1.00
## 3 c c 34.6 a  TRUE 38.00
## 4 c c 15.0 a FALSE 23.53
## 5 b b 34.6 a FALSE 23.53
## 6 c c 22.0 b FALSE 37.00
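
As a rough illustration of what imputing by mode, mean, or median means for a single column, the base R sketch below fills the missing entries of one vector. The function impute_vector is hypothetical and is not part of tidyplusR; impute() may treat ties, factors, and column types differently.

```r
# Hypothetical helper, not part of tidyplusR: fill NA entries of one vector.
impute_vector <- function(x, method = c("mode", "mean", "median")) {
  method <- match.arg(method)
  observed <- x[!is.na(x)]
  fill <- switch(method,
    mode = {
      # Most frequent observed value; ties go to the value seen first.
      values <- unique(observed)
      values[which.max(tabulate(match(observed, values)))]
    },
    mean = mean(observed),      # only meaningful for numeric vectors
    median = median(observed)   # only meaningful for numeric vectors
  )
  x[is.na(x)] <- fill
  x
}

by_mode   <- impute_vector(c("a", "b", NA, "b"))
by_mean   <- impute_vector(c(1, NA, 3), method = "mean")
by_median <- impute_vector(c(1, NA, 3, 10), method = "median")

print(by_mode)
print(by_mean)
print(by_median)
```

Applying such a helper to every column, for example with lapply over a data frame, gives column-wise imputation in the same spirit as the call above.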

Markdown Table

## default: ncol = 2 and nrow = 2, alignment = "l"
md_new()
## |    |    |
## |:---|:---|
## |    |    |
## |    |    |
## 3 by 3 table
md_new(nrow = 3, ncol = 3)
## |    |    |    |
## |:---|:---|:---|
## |    |    |    |
## |    |    |    |
## |    |    |    |
## different alignments:
md_new(nrow = 1, align = "c")
## |    |    |
## |:--:|:--:|
## |    |    |
md_new(nrow = 1, align = "r")
## |    |    |
## |---:|---:|
## |    |    |
## providing header
h <- c("foo", "boo")
md_new(header = h)
## | foo| boo|
## |:---|:---|
## |    |    |
## |    |    |
md_data(mtcars, row.index = 1:3, col.index = 1:4)
## |    |mpg|cyl|disp|hp|
## |:---|---:|---:|---:|---:|
## |Mazda RX4|21.0|6|160|110|
## |Mazda RX4 Wag|21.0|6|160|110|
## |Datsun 710|22.8|4|108|93|
## alignment to right
md_data(mtcars, row.index = 1:3, col.index = 1:4, align = "r")
## |    |mpg|cyl|disp|hp|
## |:---|---:|---:|---:|---:|
## |Mazda RX4|21.0|6|160|110|
## |Mazda RX4 Wag|21.0|6|160|110|
## |Datsun 710|22.8|4|108|93|
## provide header
md_data(mtcars, row.index = 1:3, col.index = 1:4, header = c("a","b","c","d"))
## |    |a|b|c|d|
## |:---|---:|---:|---:|---:|
## |Mazda RX4|21.0|6|160|110|
## |Mazda RX4 Wag|21.0|6|160|110|
## |Datsun 710|22.8|4|108|93|
## exclude row names
md_data(mtcars, row.index = 1:3, col.index = 1:4, row.names = FALSE)
## |mpg|cyl|disp|hp|
## |---:|---:|---:|---:|
## |21|6|160|110|
## |21|6|160|110|
## |22.8|4|108|93|
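
To show how output of this shape can be produced, the base R sketch below renders a data frame as pipe-delimited markdown lines without row names. The function to_md_table is hypothetical and is not tidyplusR's md_data(); it only mirrors the header row, alignment row, and body rows seen in the examples above.

```r
# Hypothetical helper, not part of tidyplusR: data frame to markdown lines.
to_md_table <- function(df, align = c("l", "c", "r")) {
  align <- match.arg(align)
  rule <- switch(align, l = ":---", c = ":--:", r = "---:")
  row_line <- function(cells) paste0("|", paste(cells, collapse = "|"), "|")

  header  <- row_line(names(df))
  divider <- row_line(rep(rule, ncol(df)))

  # Convert every column to text, then build one line per row.
  cells <- do.call(cbind, lapply(df, as.character))
  body <- apply(cells, 1, row_line)

  c(header, divider, body)
}

md_lines <- to_md_table(mtcars[1:3, 1:4], align = "r")
cat(md_lines, sep = "\n")
```

Building the table as a character vector, one element per line, keeps the helper easy to test and leaves printing to the caller.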

This is an open source project. Please follow the guidelines below for contributing.

- Open an issue for any feedback and suggestions.
- To contribute to the project, please refer to Contributing for details.