zachmayer / caretEnsemble

caret models all the way down :turtle:

Adding support for custom models #198

Closed eric-czech closed 8 years ago

eric-czech commented 8 years ago

Hi @zachmayer, here is a PR for custom model support with some more docs and at least one decent test. Let me know if you have any suggestions.

Also for the sake of reference, here is some example usage:

library(plyr); library(dplyr)
library(caret); library(caretEnsemble)

# Generate some data to test with
d <- twoClassSim(n=100)
X <- d %>% select(-Class); y <- d$Class

# Create a couple custom models
customRF <- getModelInfo('rf', regex=FALSE)[[1]]
customRF$method <- 'custom.rf'

# Start from caret's built-in glm definition (not rf) for the custom GLM
customGLM <- getModelInfo('glm', regex=FALSE)[[1]]
customGLM$method <- 'custom.glm'

cl <- caretList(
  X, y,
  tuneList=list(
    # The name used internally for this model will come from the "method" attribute above
    caretModelSpec(method=customRF, tuneLength=3),
    # This model on the other hand will be referred to as "myglm" not "custom.glm"
    myglm=caretModelSpec(method=customGLM, tuneLength=1),
    glmnet=caretModelSpec(method='glmnet', tuneLength=5),
    rpart=caretModelSpec(method='rpart', tuneLength=15)
  ),
  trControl=trainControl(method='cv', number=10, classProbs=TRUE)
)

cs <- caretEnsemble(cl)
print(cs)

# A glm ensemble of 4 base models: custom.rf, myglm, glmnet, rpart
# ...
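
The naming rule described in the comments above can be sketched in base R. This is only an illustration of the rule (an explicit list name wins, otherwise the name comes from the custom model's `method` element or from the method string); `resolve_model_names` and the simplified spec lists are hypothetical and are not the package's internal code.

```r
# Hypothetical helper, not part of caretEnsemble: illustrates the naming rule only.
resolve_model_names <- function(tune_list) {
  given <- names(tune_list)
  if (is.null(given)) given <- rep("", length(tune_list))
  vapply(seq_along(tune_list), function(i) {
    # An explicit name in the list takes precedence
    if (nzchar(given[i])) return(given[i])
    m <- tune_list[[i]]$method
    # Custom models are lists carrying their own "method" element;
    # built-in models are plain method strings
    if (is.list(m)) m$method else m
  }, character(1))
}

# Simplified stand-ins for the caretModelSpec() calls in the example
specs <- list(
  list(method = list(method = "custom.rf")),
  myglm = list(method = list(method = "custom.glm")),
  glmnet = list(method = "glmnet"),
  rpart = list(method = "rpart")
)

model_names <- resolve_model_names(specs)
print(model_names)
```

Under these assumptions the resolved names are custom.rf, myglm, glmnet and rpart, which matches the ensemble summary printed above.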

P.S. I closed the first PR for this after a bad merge on that branch and had to create this one instead.

lintr-bot commented 8 years ago

tests/testthat/test-ensemble.R:212:3: style: Words within variable and function names should be separated by '_' rather than '.'.

  X.class.df <- as.data.frame(X.class)
  ^~~~~~~~~~

tests/testthat/test-ensemble.R:215:34: style: Words within variable and function names should be separated by '_' rather than '.'.

  expect_warning(cl <- caretList(X.class.df, Y.class, tuneList=tune.list, trControl=train.control))
                                 ^~~~~~~~~~

tests/testthat/test-ensemble.R:225:54: style: Words within variable and function names should be separated by '_' rather than '.'.

  expect_silent(pred.classb <- predict(cs, newdata = X.class.df, type="prob"))
                                                     ^~~~~~~~~~

tests/testthat/test-ensemble.R:226:54: style: Words within variable and function names should be separated by '_' rather than '.'.

  expect_silent(pred.classc <- predict(cs, newdata = X.class.df[2,], type="prob"))
                                                     ^~~~~~~~~~

tests/testthat/test-ensemble.R:226:67: style: Commas should always have a space after.

  expect_silent(pred.classc <- predict(cs, newdata = X.class.df[2,], type="prob"))
                                                                  ^

zachmayer commented 8 years ago

Looks good to me. Please fix the lint errors, and I'll do a final review and merge. Thank you!

eric-czech commented 8 years ago

Will do (I'm traveling so sorry for the delay). I don't know why those lint errors aren't all showing up when running the local tests. Oh well I'll get them fixed eventually.

lintr-bot commented 8 years ago

tests/testthat/test-ensemble.R:226:61: style: Commas should always have a space after.

  expect_silent(pred.classc <- predict(cs, newdata = X.df[2,], type="prob"))
                                                            ^

eric-czech commented 8 years ago

@zachmayer do you have any idea why this build keeps failing? I can't really tell just based on the TravisCI logs. I fixed all the lint errors (and I keep squashing those changes down to keep everything in one commit).

That most recent issue with the build mentioned by lintr-bot was definitely fixed so I'm at a loss.

zachmayer commented 8 years ago

Taking a look at the travis logs, I see one warning and one note:

* checking R code for possible problems ... NOTE
methodCheck: no visible binding for global variable ‘type’

...

* checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'validateCustomModel'
  ‘x’
Documented arguments not in \usage in documentation object 'validateCustomModel':
  ‘a’

The warning is causing the build failure. I think you just need to update the @param entries in the roxygen comment block above your function so that they match its arguments.
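
For illustration, the check output above says the documentation describes an argument `a` while the function's usage has `x`. A minimal sketch of a consistent roxygen block, assuming the function takes a single argument `x`; the description text and the function body here are hypothetical and are not the code from this PR:

```r
#' Validate a custom caret model
#'
#' @param x a list describing a custom caret model (wording is illustrative)
#' @return \code{x}, invisibly, if it passes the checks
validateCustomModel <- function(x) {
  # Hypothetical checks: a custom model is a list with a single "method" string
  if (!is.list(x)) {
    stop("Custom model must be a list")
  }
  if (!is.character(x$method) || length(x$method) != 1) {
    stop("Custom model must have a 'method' element that is a single string")
  }
  invisible(x)
}
```

The point is only that the name after `@param` matches the name in `function(x)`; regenerating the Rd files (for example with `devtools::document()`) should then clear the warning.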

You can run these checks locally with devtools::check(), which runs the full set of R CMD check checks that CRAN uses, on top of the testthat tests. (devtools::test() only runs the testthat tests.)
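
A rough sketch of a local pre-push routine, assuming the devtools and lintr packages are installed and that it is run from the package root. `run_local_checks` is a hypothetical convenience wrapper, not something the repository provides:

```r
# Hypothetical wrapper around existing devtools/lintr functions
run_local_checks <- function(pkg = ".") {
  stopifnot(
    requireNamespace("devtools", quietly = TRUE),
    requireNamespace("lintr", quietly = TRUE)
  )
  devtools::document(pkg)            # regenerate Rd files from roxygen comments
  lints <- lintr::lint_package(pkg)  # style lints for the package sources
  check <- devtools::check(pkg)      # R CMD check, including the testthat tests
  list(lints = lints, check = check)
}

# Usage, from the package root:
# results <- run_local_checks()
# results$lints
```

Which linters fire locally may differ from what lintr-bot reports, depending on the lintr version and configuration in use.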

zachmayer commented 8 years ago

I added some comments that I think will help you fix the problem. Please let me know if you have any other issues, and I'll help you work them out!

coveralls commented 8 years ago

Coverage Status

Changes Unknown when pulling 164c74ee3508999775b1fbec316354f4f8548b1a on eric-czech:adding_custom_models into zachmayer:master.

eric-czech commented 8 years ago

Beautiful, thanks for the assist!

zachmayer commented 8 years ago

No problem, it's what I'm here for lol!