Hyperparameter tuning: Is it worth it to change the batch size?

tarrade / proj_multilingual_text_classification

Explore multilingual text classification using embeddings, BERT, and deep learning architectures

Apache License 2.0

Hyperparameter tuning: Is it worth it to change the batch size? #54

Closed: vluechinger closed this issue 4 years ago

vluechinger commented 4 years ago

A larger batch size will increase the computation cost, but will it improve the model? Why do the majority of papers use a batch size of 32 for training and 64 for validation?

tarrade commented 4 years ago

It seems that it matters, and that it depends on the type of model: https://app.wandb.ai/jack-morris/david-vs-goliath/reports/Does-model-size-matter%3F-A-comparison-of-BERT-and-DistilBERT--VmlldzoxMDUxNzU

tarrade commented 4 years ago

It will also depend on the type of hardware (available memory). We will use the standard batch size for now.
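
For anyone revisiting this later, here is a minimal sketch of one thing that changes when the batch size changes: the number of optimizer updates per epoch. It is not code from this repository. The dataset size and the list of candidate batch sizes are hypothetical values chosen for illustration, and the function names are made up for this example. Only the value 32 comes from the discussion above.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches needed to see every example once.

    The last batch may be smaller than batch_size, hence the ceiling.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_examples / batch_size)


def sweep_table(num_examples: int, batch_sizes: list[int]) -> dict[int, int]:
    """Map each candidate batch size to its number of updates per epoch."""
    return {bs: steps_per_epoch(num_examples, bs) for bs in batch_sizes}


# Hypothetical training-set size, chosen only for illustration.
NUM_TRAIN_EXAMPLES = 10_000

# Hypothetical candidates for a small sweep; 32 is the value mentioned above.
CANDIDATE_BATCH_SIZES = [16, 32, 64, 128]

if __name__ == "__main__":
    table = sweep_table(NUM_TRAIN_EXAMPLES, CANDIDATE_BATCH_SIZES)
    for batch_size, steps in table.items():
        print(f"batch_size={batch_size:>4}  updates_per_epoch={steps}")
```

Doubling the batch size roughly halves the number of weight updates per epoch, so a fair comparison between batch sizes would probably need to look at validation metrics and not only at training time. Whether a given batch size fits at all still depends on the memory of the hardware, as noted above.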