Are you saying that training from the previous fold should carry over into the current fold?
Yes, that's my understanding. Otherwise don't we end up with more than one model (one model for each fold)? I think that would be a kind of ensemble model. Maybe I'm wrong 👯‍♂️
Cross-validation is more of an error-estimation technique than a way of actually generating a model. The idea is that cross-validation tells you how accurate a model will be on new data it has never seen. So in 5-fold cross-validation, you train 5 individual models and end up with predictions for the entire dataset, which lets you calculate the error over the whole dataset. That is a much more accurate estimate of your overall error than a single training/validation split.
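To make that concrete, here's a minimal sketch of the idea using scikit-learn (my own illustration with dummy data, not code from this repo): each fold trains a fresh model, and the out-of-fold predictions together cover the whole dataset, so every sample is scored by a model that never saw it during training.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Dummy data standing in for the real training set.
X = np.random.rand(200, 10)
y = np.random.randint(0, 2, 200)

oof_preds = np.empty_like(y)  # out-of-fold predictions, one per sample

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, val_idx in kf.split(X):
    model = MLPClassifier(max_iter=500)   # a *new* model for each fold
    model.fit(X[train_idx], y[train_idx])
    oof_preds[val_idx] = model.predict(X[val_idx])

# Error estimated over the entire dataset, not just one held-out split.
print("estimated accuracy on unseen data:", accuracy_score(y, oof_preds))
```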
Now to bring it back to an individual model: what I will often do is use cross-validation to also determine the number of steps before the neural network begins to overfit. Use the validation set from each fold for early stopping and track how many steps that took. Then average the number of steps across all folds and retrain the network on the entire training set for that number of steps, with no early stopping.
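A sketch of that trick, assuming a hypothetical `build_model()` helper that returns a fresh compiled Keras model, and reusing the dummy `X` and `y` from the sketch above:

```python
import numpy as np
from sklearn.model_selection import KFold
from tensorflow.keras.callbacks import EarlyStopping

stopped_epochs = []
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, val_idx in kf.split(X):
    model = build_model()  # hypothetical: returns a fresh compiled model
    es = EarlyStopping(monitor='val_loss', patience=5,
                       restore_best_weights=True)
    history = model.fit(X[train_idx], y[train_idx],
                        validation_data=(X[val_idx], y[val_idx]),
                        epochs=200, callbacks=[es], verbose=0)
    # Number of epochs this fold actually ran before stopping.
    stopped_epochs.append(len(history.history['loss']))

# Retrain on ALL the training data for the averaged epoch count,
# with no validation split and no early stopping.
final_model = build_model()
final_model.fit(X, y, epochs=int(np.mean(stopped_epochs)), verbose=0)
```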
If you kept the same neural network across all the folds, you would be building on the training from each of the previous folds. You would no longer have a separation between training and validation data, and your validation results would be artificially good.
IMHO the model shouldn't be created inside the cross-validation loop, because we should use the same model instead of a different model for each fold. This PR will fix that.