tarrade / proj_multilingual_text_classification

Explore multilingual text classification using embeddings, BERT, and deep learning architectures
Apache License 2.0

How to choose the batch size for training and validation datasets? #3

Closed vluechinger closed 4 years ago

vluechinger commented 4 years ago

Batch sizes vary a lot between models, and the training and validation batch sizes sometimes differ as well.

How sensible is a batch size of 32 compared to 500?

Reference: https://blog.tensorflow.org/2019/11/hugging-face-state-of-art-natural.html

tarrade commented 4 years ago

The batch size could be set as a hyperparameter. Google did quite a bit of research on this topic; it is summarized in the link below and is mainly driven by out-of-memory constraints: https://github.com/google-research/bert#out-of-memory-issues
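
As a rough illustration (not code from this repo), the training and validation batch sizes can be exposed as two separate hyperparameters when building the `tf.data` pipelines; the variable names, batch sizes, and toy data below are illustrative assumptions:

```python
import tensorflow as tf

# Hypothetical hyperparameters; the values are just the two sizes discussed above.
TRAIN_BATCH_SIZE = 32   # training batches are memory-bound (gradients + optimizer state)
EVAL_BATCH_SIZE = 500   # validation can often use larger batches since no gradients are kept

def make_datasets(features, labels, train_frac=0.9):
    """Split toy data and batch the train/validation sets with different sizes."""
    n_train = int(len(labels) * train_frac)
    train_ds = (
        tf.data.Dataset.from_tensor_slices((features[:n_train], labels[:n_train]))
        .shuffle(buffer_size=n_train)
        .batch(TRAIN_BATCH_SIZE)
        .prefetch(tf.data.AUTOTUNE)
    )
    valid_ds = (
        tf.data.Dataset.from_tensor_slices((features[n_train:], labels[n_train:]))
        .batch(EVAL_BATCH_SIZE)
        .prefetch(tf.data.AUTOTUNE)
    )
    return train_ds, valid_ds

# Toy usage: random integer tensors standing in for tokenized text and binary labels.
features = tf.random.uniform((1000, 128), maxval=30522, dtype=tf.int32)
labels = tf.random.uniform((1000,), maxval=2, dtype=tf.int32)
train_ds, valid_ds = make_datasets(features, labels)
```

With this setup the training batch size can be tuned (or capped by GPU memory, as in the BERT out-of-memory guide linked above) independently of the validation batch size.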