Open junshiguo opened 3 years ago
For mini-batch gradient updates, we need to shuffle the training data so that the actual training inputs for each iteration are different and random. This should improve model performance given the same number of training iterations.
Applicable models include NN, LR, and WDL.
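A minimal NumPy sketch of the idea, re-shuffling the data at the start of every epoch so each pass yields mini-batches in a new random order (the function name and shapes here are illustrative, not the project's actual API):

```python
import numpy as np

def minibatches(X, y, batch_size, rng):
    """Yield mini-batches over a fresh random permutation of the data."""
    idx = rng.permutation(len(X))  # new shuffle each call (i.e. each epoch)
    for start in range(0, len(X), batch_size):
        sel = idx[start:start + batch_size]
        yield X[sel], y[sel]

rng = np.random.default_rng(0)
X = np.arange(20, dtype=float).reshape(10, 2)  # 10 samples, 2 features
y = np.arange(10)

for epoch in range(2):
    for xb, yb in minibatches(X, y, batch_size=4, rng=rng):
        pass  # the gradient step on (xb, yb) would go here
```

Because the permutation is redrawn per epoch, every sample is still visited exactly once per epoch, but the batch composition differs across epochs.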