Batch size: training versus validation

tarrade / proj_multilingual_text_classification

Explore multilingual text classification using embeddings, BERT and deep learning architectures

Apache License 2.0

Closed tarrade closed 4 years ago

tarrade commented 4 years ago

Why should we use different batch sizes for training and validation?

github-actions[bot] commented 4 years ago

@tarrade Thanks for reporting this issue!

tarrade commented 4 years ago

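You don’t compute gradients for the validation set, so nothing has to be kept in memory for a backward pass and more GPU memory is available. A larger batch therefore fits during validation; usually the validation batch size is even set to bs*2, i.e. twice the training batch size.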
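
To make the point above concrete, here is a minimal sketch, assuming PyTorch (the thread does not name a framework). The toy data, the model and the value of `bs` are hypothetical and only serve to illustrate the pattern: train with batch size `bs`, validate with `bs * 2` inside `torch.no_grad()`.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical toy data: 64 training and 32 validation samples,
# 16 features each, 2 classes.
train_ds = TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,)))
valid_ds = TensorDataset(torch.randn(32, 16), torch.randint(0, 2, (32,)))

bs = 8  # training batch size (illustrative value, not from the thread)
train_dl = DataLoader(train_ds, batch_size=bs, shuffle=True)
valid_dl = DataLoader(valid_ds, batch_size=bs * 2)  # larger: no backward pass

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Training: activations are kept for the backward pass and
# a gradient is stored for every parameter.
model.train()
for x, y in train_dl:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Validation: no_grad() disables graph construction, so no activations
# are retained for backpropagation and no gradients are computed.
model.eval()
valid_loss, n_seen = 0.0, 0
with torch.no_grad():
    for x, y in valid_dl:
        out = model(x)
        valid_loss += loss_fn(out, y).item() * len(y)
        n_seen += len(y)
valid_loss /= n_seen  # mean loss over the validation set
```

The factor of two is a rule of thumb rather than a guarantee; how much larger the validation batch can actually be depends on the model and the GPU, so it may be worth checking on the real setup.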