zachmayer / caretEnsemble

caret models all the way down :turtle:

Pass hyperparameters to a caretList in addition to random search #196

Open JasonCEC opened 8 years ago

JasonCEC commented 8 years ago

Hi caretEnsemble team!

This is a feature request I would be happy to help build if pointed in the right direction.

I batch-retrain ~20 ensemble models once a week using random hyperparameter search. The retraining does not always improve the models, so I would like a way to include the optimal hyperparameters from the previous training in each model's retraining.

For example, rf has only one tuning parameter, mtry, and a random search might select: 5, 22, 37, 100, 1241. If rf performed better last week with mtry = 80, updating the model through random search will actually degrade performance, even though we've spent computational time on another search.

An easy solution would be a way to supply the last best hyperparameters for each model, and evaluate those in addition to the random search candidates.

I suspect this might require changes to caret as well... how should I go about implementing this?

Cheers!

zachmayer commented 8 years ago

I'm not super familiar with how caret does random search, but I have 2 ideas:

  1. Add the previous best parameters to the tuneGrid parameter for caret::train
  2. Write a simple caret::train wrapper that fits 2 models on the same CV folds: one with the previous best parameters and trControl=trainControl(method="none"), and another using random search.
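Idea 2 might look roughly like the sketch below. Note it deviates from the comment in one respect: to pick a winner you need resampled performance for both fits, so this uses CV with shared folds for the pinned-parameters model rather than `method = "none"`. The helper name `refit_with_best` is made up for illustration, not an existing caret function.

```r
library(caret)

# Sketch of idea 2: fit one model pinned to last week's best parameters
# and one model via random search, over the same CV folds, then keep
# whichever resampled better. `refit_with_best` is a hypothetical name.
refit_with_best <- function(x, y, method, best_params, tune_length = 10) {
  # Shared resampling indices so the comparison is apples-to-apples.
  folds <- createFolds(y, k = 5, returnTrain = TRUE)

  # Model 1: previous best hyperparameters only (a 1-row tuneGrid).
  ctrl_fixed <- trainControl(method = "cv", index = folds)
  fit_best <- train(x, y, method = method,
                    tuneGrid = best_params, trControl = ctrl_fixed)

  # Model 2: random search over the same folds.
  ctrl_rand <- trainControl(method = "cv", index = folds, search = "random")
  fit_rand <- train(x, y, method = method,
                    tuneLength = tune_length, trControl = ctrl_rand)

  # Compare on the training metric. This assumes a maximize-style metric
  # such as Accuracy; flip the comparison for RMSE.
  perf <- function(fit) max(fit$results[[fit$metric]], na.rm = TRUE)
  if (perf(fit_best) >= perf(fit_rand)) fit_best else fit_rand
}

# Illustrative usage on iris with rpart (one tuning parameter, cp):
fit <- refit_with_best(iris[, 1:4], iris$Species, method = "rpart",
                       best_params = data.frame(cp = 0.01),
                       tune_length = 5)
```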

More generally, I could see adding this functionality to caret. Basically do a grid search AND a random search. This guarantees you test the params in the grid, but also lets the search look outside of that grid. Call it "augmented random search" or something.
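One way to emulate that "augmented random search" today, without changing caret, is to draw the random candidates yourself via the model's own `grid()` function (every caret model exposes one through `getModelInfo`), rbind the previous best row onto them, and pass the result as an explicit `tuneGrid`. A sketch, assuming rf (single tuning parameter `mtry`) and an illustrative previous-best value of 2:

```r
library(caret)

# Emulate "augmented random search": random candidates plus last week's
# best parameters, combined into one explicit tuneGrid.
rf_info <- getModelInfo("rf", regex = FALSE)[["rf"]]

x <- iris[, 1:4]
y <- iris$Species

# The model's grid() function can generate random candidates directly.
random_grid <- rf_info$grid(x, y, len = 5, search = "random")

# Previous best parameters; the column names must match the model's
# tuning parameters exactly (mtry for rf). Value here is illustrative.
best_params <- data.frame(mtry = 2)

augmented_grid <- unique(rbind(best_params, random_grid))

# caret now evaluates the previous best alongside the random draws.
fit <- train(x, y, method = "rf",
             tuneGrid = augmented_grid,
             trControl = trainControl(method = "cv", number = 3))
```

Since `train` returns the best row of the grid it was given, the previous best can only be displaced by a candidate that genuinely resampled better, which addresses the regression described at the top of the issue.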