SCAR / EGABIcourse19

SUPERLEARNER! #14

Open grwhumphries opened 5 years ago

grwhumphries commented 5 years ago

I highly recommend this great new package, SuperLearner:
https://cran.r-project.org/web/packages/SuperLearner/vignettes/Guide-to-SuperLearner.html
You can run all your models in one beautiful line of code! Assuming you have a training dataset set up like this:

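# Assumption: `data` and `outcome` come from the Boston housing data in MASS,
# which matches the median-home-value outcome used further down.
data(Boston, package = "MASS")
outcome = Boston$medv
data = subset(Boston, select = -medv)

# Set a seed so the random split is reproducible.
set.seed(1)

# Reduce to a dataset of 150 observations to speed up model fitting.
train_obs = sample(nrow(data), 150)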

# X is our training sample.
x_train = data[train_obs, ]

# Create a holdout set for evaluating model performance.
# Note: cross-validation is even better than a single holdout sample.
x_holdout = data[-train_obs, ]

# Create a binary outcome variable: towns in which median home value is > 22,000.
outcome_bin = as.numeric(outcome > 22)

y_train = outcome_bin[train_obs]
y_holdout = outcome_bin[-train_obs]

Then all you have to do to run a simple model is:

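# The glmnet and ranger packages must be installed for the SL.glmnet and
# SL.ranger wrappers; method.AUC also needs the cvAUC package.
library(SuperLearner)

sl = SuperLearner(Y = y_train, X = x_train, family = binomial(),
                  method = "method.AUC",
                  SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"))

# Print the cross-validated risk and ensemble weight of each learner.
sl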
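
The holdout set created above is never used in the snippet, so here is a minimal sketch of how it could be used to check the ensemble. It assumes the data is `MASS::Boston` (my guess from the "median home value" comment, not something stated in the post) and that the MASS, SuperLearner, glmnet, ranger and cvAUC packages are installed. The seed and the name `holdout_auc` are illustrative only.

```r
# Sketch only: the Boston data, the seed and `holdout_auc` are assumptions
# made for illustration, not part of the original post.
library(SuperLearner)

set.seed(1)
data(Boston, package = "MASS")
outcome_bin = as.numeric(Boston$medv > 22)
data = subset(Boston, select = -medv)

# Same train/holdout split as in the post.
train_obs = sample(nrow(data), 150)
x_train = data[train_obs, ]
x_holdout = data[-train_obs, ]
y_train = outcome_bin[train_obs]
y_holdout = outcome_bin[-train_obs]

sl = SuperLearner(Y = y_train, X = x_train, family = binomial(),
                  method = "method.AUC",
                  SL.library = c("SL.mean", "SL.glmnet", "SL.ranger"))

# Predict on the holdout set. onlySL = TRUE computes predictions only for
# learners that received a non-zero weight in the ensemble.
pred = predict(sl, x_holdout, onlySL = TRUE)

# Area under the ROC curve of the ensemble on the holdout set.
holdout_auc = cvAUC::AUC(predictions = as.vector(pred$pred),
                         labels = y_holdout)
holdout_auc
```

`pred$pred` holds the ensemble prediction, while `pred$library.predict` holds the per-learner predictions if you want to compare them. A single holdout estimate on a split this small will be noisy, so treat the number as a rough check.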