Closed by vluechinger 4 years ago
Batch sizes vary a lot between models. Also, there are sometimes differences in the batch sizes of training and validation.
How sensible are batches of 32 compared to 500?
Reference: https://blog.tensorflow.org/2019/11/hugging-face-state-of-art-natural.html
Batch size could be set as a hyperparameter. Google did quite a bit of research on this topic. It is summarized at the link below, and the choice is driven mainly by out-of-memory constraints: https://github.com/google-research/bert#out-of-memory-issues
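To make the hyperparameter idea concrete, here is a minimal sketch (not taken from any of the linked projects) that keeps separate training and validation batch sizes in one config and shows how the batch size changes the number of steps per epoch. The names `BatchConfig` and `steps_per_epoch`, the default values, and the dataset size of 10,000 are all hypothetical and only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class BatchConfig:
    # Hypothetical defaults; tune both like any other hyperparameter.
    train_batch_size: int = 32
    # Validation needs no gradients, so a larger batch often fits in memory.
    eval_batch_size: int = 500


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches needed to see every example once (last batch may be partial)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_examples / batch_size)


if __name__ == "__main__":
    config = BatchConfig()
    num_examples = 10_000  # assumed dataset size, for illustration only
    print("train steps:", steps_per_epoch(num_examples, config.train_batch_size))
    print("eval steps:", steps_per_epoch(num_examples, config.eval_batch_size))
```

With these assumed numbers, a batch size of 32 means 313 weight updates per epoch, while 500 means only 20, so the two settings are not interchangeable during training. For validation the batch size only affects speed and memory use, which is one reason the two values often differ.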