zachmayer / caretEnsemble

caret models all the way down :turtle:
http://zachmayer.github.io/caretEnsemble/

unit tests for Optimizers #75

Closed zachmayer closed 9 years ago

zachmayer commented 10 years ago

so we know they work!

jknowles commented 10 years ago

@zachmayer This is why we write unit tests. I have found some edge cases where the optimizers fail -- both greedy and safe, for AUC.

These cases seem somewhat degenerate -- i.e. few samples and many columns/models, but we need to put protections in place for users.

Take a look at this and let me know what you think we might want to do in this case.

Here is a MWE:

library(caret)
library(caretEnsemble)

load(system.file("testdata/stuGradMod.rda",
                 package="caretEnsemble", mustWork=TRUE))

set.seed(3425)

ctrl <- trainControl(method = "cv",
                     number = 5, classProbs = TRUE, savePredictions = TRUE,
                     summaryFunction = twoClassSummary)

out <- caretList(
  x = modeldat2$traindata$preds,
  y = modeldat2$traindata$class,
  trControl = ctrl,
  tuneLength = 3,
  methodList = c("knn", "nb", "lda", "nnet"))

studentEns1 <- caretEnsemble(out, optFUN = safeOptAUC, iter = 500)
studentEns2 <- caretEnsemble(out, optFUN = greedOptAUC, iter = 500)
studentEns3 <- caretEnsemble(out)

The bad part is that the ensembled AUC reported is 0.8304 in all 3 cases, while the component models are all hovering around 0.93. In this case the optimizer should either do better, or it should raise an error. Not sure how to handle this.

zachmayer commented 10 years ago

Hmmmm, me neither. What happens after 1 iteration? Do the optimizers at least start with the best single model?

Maybe we start with a fallback: if the ensemble is worse than the best single model, we set that model's weight to 1 and all other weights to 0, and issue a warning?
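Something like this, maybe? A minimal sketch of the fallback check (the function name and signature are hypothetical, not the actual caretEnsemble internals):

```r
# Hypothetical helper, NOT the real caretEnsemble code: if the optimized
# ensemble scores worse than the best single model, fall back to a one-hot
# weight vector on that model and warn the user.
fallbackWeights <- function(weights, model_aucs, ensemble_auc) {
  best <- which.max(model_aucs)
  if (ensemble_auc < model_aucs[best]) {
    warning("Ensemble AUC (", round(ensemble_auc, 4),
            ") is worse than the best single model (",
            round(model_aucs[best], 4), "); using that model alone.")
    weights <- rep(0, length(model_aucs))
    weights[best] <- 1
  }
  weights
}
```

That would at least guarantee the ensemble never reports a worse AUC than its best component.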

jknowles commented 10 years ago

After 1 iteration it has picked the best model:

studentEns1 <- caretEnsemble(out, optFUN = safeOptAUC, iter = 1)
studentEns2 <- caretEnsemble(out, optFUN = greedOptAUC, iter = 1)
> summary(studentEns1)
The following models were ensembled: nnet 
They were weighted: 
1
The resulting AUC is: 0.9749
The fit for each individual model on the AUC is: 
 method    metric   metricSD
   nnet 0.9748845 0.05372301
> summary(studentEns2)
The following models were ensembled: nnet 
They were weighted: 
1
The resulting AUC is: 0.9749
The fit for each individual model on the AUC is: 
 method    metric   metricSD
   nnet 0.9748845 0.05372301
jknowles commented 10 years ago

I will keep testing this. I don't think safeOptAUC is using the right stopping criterion, or the criterion is not implemented correctly in the first place -- somehow we should be able to back off when the optimization starts to drop below the single best model.

I'm working on making reproducible cases for the optimization being worse than component models, then I'll work on building in checks to try to remedy these situations.

The way I am inducing this is by using smaller training sets with only 150-300 cases. There the cross-validation results and metric estimates for the component models are probably less precise, which makes it harder to optimize beyond the individual models in some instances.

Not sure what to do about that just yet. An option there would be to allow the optimization to be done on another dataset beyond the original training -- though this would require substantial re-working of caretEnsemble and is not suited for 1.0. :speak_no_evil:

zachmayer commented 10 years ago

That's a good start. For 1.0, let's just make sure safeOptAUC stops after 1 iteration in such cases.

jknowles commented 10 years ago

Agreed. The hard part will be figuring out what to pass to safeOptAUC so it knows it's in a case where it should stop after 1 iteration. I'll code something up and you can review it and modify it if you find a better solution.
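Roughly what I have in mind, as a self-contained sketch (function and argument names are hypothetical, not the actual safeOptAUC implementation): seed the weights with the single best model, then stop the greedy loop as soon as no step improves the score.

```r
# Hypothetical sketch of a greedy optimizer with early stopping -- NOT the
# real safeOptAUC. `preds` is a matrix of model predictions (one column per
# model), `obs` the observed outcome, `score_fun` any higher-is-better metric.
greedyWithEarlyStop <- function(preds, obs, score_fun, iter = 500) {
  n <- ncol(preds)
  counts <- rep(0, n)
  # Seed with the single best model, so we can never end up worse than it.
  single <- sapply(seq_len(n), function(j) score_fun(preds[, j], obs))
  counts[which.max(single)] <- 1
  best_score <- max(single)
  for (i in seq_len(iter)) {
    # Score every candidate one-step addition to the current weight counts.
    trial <- sapply(seq_len(n), function(j) {
      w <- counts
      w[j] <- w[j] + 1
      score_fun(preds %*% (w / sum(w)), obs)
    })
    # Stop as soon as no addition improves on the current best score.
    if (max(trial) <= best_score) break
    counts[which.max(trial)] <- counts[which.max(trial)] + 1
    best_score <- max(trial)
  }
  counts / sum(counts)
}
```

With degenerate inputs this collapses to the 1-iteration behavior we saw above: the best single model gets weight 1 and everything else gets 0.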

zachmayer commented 10 years ago

Yes. Take a shot at it -- but for v1, a test for the degenerate case at the end of optimization, plus a warning, is a good start.

jknowles commented 9 years ago

PR #84 closes this issue once merged.