jeffheaton / t81_558_deep_learning

T81-558: Keras - Applications of Deep Neural Networks @Washington University in St. Louis
https://sites.wustl.edu/jeffheaton/t81-558/

Move model out of cross validation loop #18

Closed by thusithathilina 6 years ago

thusithathilina commented 6 years ago

IMHO the model shouldn't be created inside the cross-validation loop, because we should use the same model instead of a different model for each fold. This PR will fix that.

jeffheaton commented 6 years ago

Are you saying we should let training from the previous fold carry over into the current fold?

thusithathilina commented 6 years ago

Yes, that's my understanding. Otherwise, aren't we ending up with more than one model (one model for each fold)? I think that is a kind of ensemble model. Maybe I'm wrong 👯‍♂️

jeffheaton commented 6 years ago

Cross validation is more of an error-estimation technique than a way of actually generating a model. The idea is that cross validation tells me how accurate a model will be on new data it has never seen. So in a 5-fold, you train 5 individual models and end up with predictions on the entire dataset, which allows you to calculate the error over the entire dataset. This is a much more accurate estimate of your overall error than a simple training/validation split.
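For readers following along, here is a minimal sketch of that per-fold approach: a fresh model is built inside every fold, and out-of-fold predictions are collected so the error is measured over the whole dataset. The data, the `build_model` helper, and the layer sizes are illustrative placeholders, not the course's actual code.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense


def build_model(n_inputs):
    # Hypothetical regression network; any architecture works here.
    model = Sequential([
        Input(shape=(n_inputs,)),
        Dense(25, activation="relu"),
        Dense(10, activation="relu"),
        Dense(1),
    ])
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model


x = np.random.rand(500, 8)   # placeholder features
y = np.random.rand(500)      # placeholder targets

oof_pred = np.zeros(len(y))  # out-of-fold predictions
kf = KFold(n_splits=5, shuffle=True, random_state=42)

for train_idx, val_idx in kf.split(x):
    # A brand-new model every fold, so no training leaks between folds.
    model = build_model(x.shape[1])
    model.fit(x[train_idx], y[train_idx], epochs=50, verbose=0)
    oof_pred[val_idx] = model.predict(x[val_idx], verbose=0).flatten()

# Error over the entire dataset: each point was predicted by a model
# that never saw it during training.
print("CV RMSE:", np.sqrt(mean_squared_error(y, oof_pred)))
```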

Now to bring it back to an individual model, what I will often do is use the cross validation to also determine the number of steps before the neural network begins to overfit. Use the validation set from each fold for early stopping and track how many steps that took. Then average the number of steps across all folds and retrain the network on the entire training set for that number of steps, with no early stopping.
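A sketch of that epoch-averaging idea, reusing the `build_model`, `x`, and `y` placeholders from the snippet above; the patience and epoch counts are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.model_selection import KFold
from tensorflow.keras.callbacks import EarlyStopping

best_epochs = []
kf = KFold(n_splits=5, shuffle=True, random_state=42)

for train_idx, val_idx in kf.split(x):
    model = build_model(x.shape[1])
    monitor = EarlyStopping(monitor="val_loss", patience=5,
                            restore_best_weights=True)
    history = model.fit(x[train_idx], y[train_idx],
                        validation_data=(x[val_idx], y[val_idx]),
                        callbacks=[monitor], epochs=1000, verbose=0)
    # Record the epoch with the lowest validation loss for this fold.
    best_epochs.append(int(np.argmin(history.history["val_loss"])) + 1)

# Average the stopping point across folds, then retrain on all the
# training data for that many epochs, with no early stopping.
avg_epochs = int(np.mean(best_epochs))
final_model = build_model(x.shape[1])
final_model.fit(x, y, epochs=avg_epochs, verbose=0)
```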

If you kept the same neural network for each of the folds, you would be building on the training from each of the previous folds. You would no longer have a separation of training and validation data and your validation results would be artificially good.