It seems it matters and depends on the type of model: https://app.wandb.ai/jack-morris/david-vs-goliath/reports/Does-model-size-matter%3F-A-comparison-of-BERT-and-DistilBERT--VmlldzoxMDUxNzU
It will also depend on the type of hardware (available memory). We will use the standard batch size for now.
A larger batch size will increase the computational cost per step, but will it improve the model? Why do the majority of papers use a batch size of 32 for training and 64 for validation?
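For context, here is a minimal sketch of what those conventional batch sizes look like in practice, assuming a PyTorch `DataLoader` setup (the `train_dataset` / `val_dataset` tensors below are hypothetical placeholders, not from this repo). The usual reason the validation batch can be larger is that evaluation runs without gradients, so no activations or optimizer state compete for the same GPU memory:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical placeholder datasets standing in for real tokenized data.
train_dataset = TensorDataset(torch.randn(256, 8), torch.randint(0, 2, (256,)))
val_dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))

# Training uses the smaller batch (32): each forward pass must also keep
# activations for backprop, so memory per example is higher here.
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

# Validation typically runs under torch.no_grad(), so a larger batch (64)
# fits in the same memory budget and finishes evaluation faster.
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)
```

Whether the larger training batch actually helps model quality (as opposed to just throughput) is exactly what the linked wandb report probes; the 32/64 convention is largely a memory-budget heuristic rather than a result.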