zachmayer / caretEnsemble

caret models all the way down :turtle:

caretStack with mixture of models and features as inputs #163

Open devonbrackbill opened 8 years ago

devonbrackbill commented 8 years ago

Sometimes you might want to run an ensemble where the inputs are predictions from several underlying models AND raw features. For motivation, see the winning solution to the Otto Challenge on Kaggle, where the authors built an ensemble on 33 underlying models AND 8 raw features. caretStack() requires a list of caret models. How might we implement this using this package? Is there something straightforward, like making 8 underlying models the identity function so each one picks up one of the 8 raw features? Or something similarly obvious I'm missing? Or is it non-trivial to implement?

zachmayer commented 8 years ago

I haven't directly implemented this yet, but you could use caretEnsemble:::makePredObsMatrix and then manually add additional predictors to the datasets.
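A rough sketch of that idea, assuming `models` is a fitted caretList and `rawFeatures` is a data frame of the extra columns you want the meta-learner to see. Note that `makePredObsMatrix` is an internal function, so its return shape (assumed here to be a list with `preds` and `obs`) may differ between versions:

```r
library(caret)
library(caretEnsemble)

# Internal helper: builds the matrix of out-of-fold base-model predictions.
# ASSUMPTION: it returns a list with $preds (matrix) and $obs (outcome vector).
predobs <- caretEnsemble:::makePredObsMatrix(models)

# Append the raw features to the base-model predictions...
stackData <- data.frame(predobs$preds, rawFeatures)

# ...and train any caret model as the meta-learner on the combined inputs.
meta <- train(x = stackData, y = predobs$obs, method = "glm")
```

At prediction time you would need to apply the same augmentation: bind the base models' predictions on new data together with the corresponding raw feature columns before calling `predict(meta, ...)`.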

devonbrackbill commented 8 years ago

Awesome, thanks for the pointer to the correct function! I'll play around with modifying this function.

asadkhanmohmand commented 7 years ago

Respected sir, how do I find the confusion matrix of the stack result?

```r
models <- caretList(x1 ~ ., data = dnaFrame, trControl = trainControl,
                    methodList = algorithmList)
stack.glm <- caretStack(models, method = "randomGLM", metric = "Accuracy",
                        trControl = trainControl)
stack.glm
```

```
A ctree ensemble of 2 base models: rf, nnet, svmRadial

Ensemble results:
Conditional Inference Tree

2426 samples
   3 predictor
   2 classes: 'no', 'yes'

No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 1940, 1941, 1941, 1941, 1941
Resampling results across tuning parameters:

  mincriterion  Accuracy   Kappa
  0.01          0.8342989  0.6273422
  0.50          0.8586119  0.6902503
  0.99          0.8569649  0.6903345

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mincriterion = 0.5.
```
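One possible way to get a confusion matrix for a fitted stack, sketched under the assumption that `testData` is a held-out data frame (not used to train the base models) with the true labels in a column named `Class`, and that `predict()` on the stack returns class labels for a classification problem:

```r
library(caret)

# Predict class labels on held-out data with the fitted stack.
# ASSUMPTION: predict.caretStack returns class labels here; depending on
# the caretEnsemble version you may need a type = "raw"/"prob" argument.
preds <- predict(stack.glm, newdata = testData)

# Cross-tabulate predictions against the truth.
confusionMatrix(data = preds, reference = testData$Class)
```

Computing this on the training data would give an optimistic estimate; the resampled Accuracy/Kappa shown above are a better guide to out-of-sample performance.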