twidlr is an R package that exposes a consistent API for model functions and their corresponding predict methods such that they are specified as:
fit <- model(data, formula, ...)
predict(fit, data, ...)
Where "data" is a required data.frame (or able to be coerced to one) and "formula" is a formula (or string able to be coerced to one) that describes the model to be fitted.
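The coercion rules described above can be sketched in base R. This is only an illustration under stated assumptions: `coerce_inputs` is a hypothetical helper written for this example, not a function from twidlr, and twidlr's own internals may differ.

```r
# Hypothetical helper (not part of twidlr): coerce the two leading
# arguments to the types the text describes.
coerce_inputs <- function(data, formula) {
  list(
    data    = as.data.frame(data),        # e.g. a matrix becomes a data.frame
    formula = stats::as.formula(formula)  # e.g. a string becomes a formula
  )
}

# A numeric matrix and a string stand in for "data" and "formula".
inputs <- coerce_inputs(as.matrix(mtcars[, c("hp", "wt")]), "hp ~ wt")

# The coerced inputs can be handed to an ordinary model function.
fit <- stats::lm(inputs$formula, data = inputs$data)
```

Coercing up front means the model function itself only ever sees a data.frame and a formula, whatever the caller passed in.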
twidlr gets its name from the "twiddle" used in R formulas.
twidlr is available to install from GitHub by running:
# install.packages("devtools")
devtools::install_github("drsimonj/twidlr")
library(twidlr)
twidlr exposes model functions that you're already familiar with, but such that they accept a data.frame first, a formula second, and then additional arguments. A robust method to predict data is also exposed.
For example, a typical linear model would be lm(hp ~ mpg * wt, mtcars, ...). Once twidlr is loaded, the same model would be run via lm(mtcars, hp ~ mpg * wt, ...).
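To make the argument order concrete without installing anything, the following base R sketch wraps `stats::lm` and `stats::predict` in a data-first style. The names `lm_data_first` and `predict_data_required` are hypothetical and exist only for this example; they are not twidlr's source code.

```r
# Hypothetical data-first wrapper around stats::lm (illustration only).
lm_data_first <- function(data, formula, ...) {
  stats::lm(formula = stats::as.formula(formula),
            data    = as.data.frame(data), ...)
}

# Hypothetical predict wrapper in which `data` has no default,
# so calling it without data is an error rather than a silent fallback.
predict_data_required <- function(fit, data, ...) {
  stats::predict(fit, newdata = as.data.frame(data), ...)
}

fit_base  <- stats::lm(hp ~ mpg * wt, mtcars)        # usual argument order
fit_first <- lm_data_first(mtcars, hp ~ mpg * wt)    # data first, formula second

# Both calls fit the same model.
same_coefs <- isTRUE(all.equal(coef(fit_base), coef(fit_first)))

# Predictions always name the data they are computed on.
preds <- predict_data_required(fit_first, mtcars[1:3, ])
```

Only the calling convention changes; the fitted model is the one `stats::lm` would produce.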
Modelling in R is messy! Some models take formulas and data frames while others require matrices and vectors. The same can be said of corresponding predict() methods, which can also be impure, returning unexpected or inconsistent results.
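Two of these inconsistencies can be reproduced with base R alone. The sketch below assumes a fresh session in which no attached package supplies a `predict` method for `kmeans` objects.

```r
set.seed(1)

# lm takes a formula and a data frame ...
fit_lm <- stats::lm(hp ~ wt, data = mtcars)

# ... while kmeans takes a numeric matrix and no formula.
fit_km <- stats::kmeans(as.matrix(mtcars[, c("hp", "wt")]), centers = 2)

# Without newdata, predict() on an lm quietly returns the fitted values.
same_as_fitted <- isTRUE(all.equal(predict(fit_lm), fitted(fit_lm)))

# Base R has no predict method for kmeans objects, so this call fails.
km_attempt <- try(predict(fit_km), silent = TRUE)
km_failed  <- inherits(km_attempt, "try-error")
```

One model predicts without being told which data to use, while the other cannot predict at all.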
twidlr seeks to overcome these problems by providing:
- A consistent API for model functions and their predict methods (helping to improve the generality of tidy modelling packages like pipelearner)
- predict being made available for all methods (including unsupervised algorithms like kmeans) and making "data" a required argument
- Model functions that can be piped, e.g., mtcars %>% lm(hp ~ wt)
- Formula support for functions such as glmnet, e.g., glmnet(iris, Sepal.Width ~ Petal.Width * Petal.Length + Species). Formulas created as strings can always be used too!

Model functions exposed by twidlr:
Package | Functions |
---|---|
e1071 | naiveBayes, svm |
gamlss | gamlss |
glmnet | cv.glmnet, glmnet |
lme4 | glmer, lmer |
quantreg | crq, nlrq, rq, rqss |
randomForest | randomForest |
rpart | rpart |
stats | aov, factanal, glm, kmeans, lm, prcomp, t.test (now 'ttest') |
xgboost | xgboost |
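For an unsupervised entry in the table such as kmeans, one plausible meaning of "predict" is assigning each new row to its nearest cluster centre. The sketch below shows that idea in base R. `predict_kmeans` is a hypothetical function written for this example, and nearest-centre assignment is an assumption about the behaviour, not a description of twidlr's actual method.

```r
# Hypothetical predict function for kmeans fits (illustration only):
# assign each row of `data` to the closest cluster centre.
predict_kmeans <- function(fit, data) {
  centers <- fit$centers
  # Keep the columns in the same order as the fitted centres.
  x <- as.matrix(as.data.frame(data))[, colnames(centers), drop = FALSE]
  # Squared Euclidean distance from every row to every centre.
  d2 <- sapply(seq_len(nrow(centers)), function(k) {
    rowSums(sweep(x, 2, centers[k, ], "-")^2)
  })
  d2 <- matrix(d2, nrow = nrow(x))
  # Index of the smallest distance in each row.
  max.col(-d2, ties.method = "first")
}

set.seed(1)
train <- data.frame(x = c(0, 0.1, 0.2, 10, 10.1, 10.2),
                    y = c(0, 0.2, 0.1, 10, 10.2, 10.1))
fit <- stats::kmeans(train, centers = 2, nstart = 5)

new_points <- data.frame(x = c(0.05, 9.9), y = c(0.1, 10.05))
pred <- predict_kmeans(fit, new_points)
```

Because `data` has no default, the function cannot silently fall back to the training data.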
For conventions and best practices when contributing to twidlr, please see CONTRIBUTING.md.