How to choose the batch size for training and validation datasets?

tarrade / proj_multilingual_text_classification

Explore multilingual text classification using embeddings, BERT and deep learning architectures

Apache License 2.0

Closed by vluechinger 4 years ago

vluechinger commented 4 years ago

Batch sizes vary a lot between models. Also, there are sometimes differences in the batch sizes of training and validation.

How sensible are batches of 32 compared to 500?

tarrade commented 4 years ago

It could be set as a hyperparameter. Google did quite a bit of research on this topic. It is summarized at the link below, and the choice is mostly driven by out-of-memory constraints: https://github.com/google-research/bert#out-of-memory-issues
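
To make the "hyperparameter bounded by memory" idea concrete, here is a minimal sketch in plain Python. The helper names (`steps_per_epoch`, `largest_fitting_batch_size`) and the `fits_in_memory` callback are hypothetical and not part of this repository or of BERT. In practice the callback would run one real training (or evaluation) step and report whether it hit an out-of-memory error.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def largest_fitting_batch_size(candidates, fits_in_memory):
    """Return the largest candidate batch size accepted by fits_in_memory.

    fits_in_memory is a hypothetical callback: it takes a batch size and
    returns True if one step with that batch size runs without running
    out of memory.
    """
    for batch_size in sorted(candidates, reverse=True):
        if fits_in_memory(batch_size):
            return batch_size
    raise ValueError("no candidate batch size fits in memory")


# Illustrative stand-ins only: real limits depend on the model, the
# sequence length and the accelerator.
def train_fits(batch_size):
    return batch_size <= 32


def eval_fits(batch_size):
    return batch_size <= 128


candidates = [8, 16, 32, 64, 128, 500]
train_batch_size = largest_fitting_batch_size(candidates, train_fits)
eval_batch_size = largest_fitting_batch_size(candidates, eval_fits)
```

The memory-fitting value is only an upper bound. The training batch size also affects optimization, so it is still worth tuning together with the learning rate. Validation usually tolerates a larger batch because no gradients or optimizer state are kept, and for a typical model in evaluation mode the batch size changes only the speed, not the metrics. That is one reason the two values often differ between models.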