Open igorbpf opened 7 years ago
Does anyone know how overfitting was dealt with? I read something about early stopping in section 3.1, but I start overfitting by the second epoch of training with the hyperparameters specified in the article. Is that expected?
Try regularization: L2, dropout, or both.
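For reference, here is a rough sketch of two of the ideas mentioned in this thread: early stopping on validation loss and an L2 penalty. It is not the article's implementation. The function names, the `patience` and `min_delta` parameters, and the loss values are all made up for illustration, and it uses plain Python so it does not depend on any framework.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (best_epoch, stop_epoch), both 0-indexed.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. The weights to keep
    are the ones from best_epoch. (Hypothetical helper, not from the article.)
    """
    best_loss = float("inf")
    best_epoch = -1
    stop_epoch = len(val_losses) - 1  # default: ran through every epoch
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember this epoch as the best one.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            stop_epoch = epoch
            break
    return best_epoch, stop_epoch


def l2_penalty(weights, lam):
    """L2 regularization term lam * sum(w^2), to be added to the training loss."""
    return lam * sum(w * w for w in weights)


# Made-up validation losses that bottom out at the second epoch (index 1).
losses = [0.90, 0.70, 0.75, 0.80, 0.95]
best, stop = early_stopping_epoch(losses, patience=2)
print(best, stop)  # 1 3
```

With a validation curve like this one, overfitting after the second epoch is exactly what early stopping is meant to catch: you keep the checkpoint from the best epoch and discard the later ones.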