stanford-crfm / mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.
Apache License 2.0
562 stars, 49 forks

set train_batch_size to auto #101

Closed: J38 closed this 3 years ago

J38 commented 3 years ago

When running training with DeepSpeed, "train_batch_size" must be set to "auto". This pull request updates a few of the DeepSpeed config files accordingly.
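
As a rough illustration of the change, the sketch below sets the key on an in-memory config. The example config contents and the helper name `set_auto_batch_size` are assumptions made for illustration, not the actual files or code touched by this pull request. With the Hugging Face Trainer's DeepSpeed integration, an "auto" value is filled in from the Trainer's own arguments at launch time.

```python
import json


def set_auto_batch_size(config: dict) -> dict:
    """Return a shallow copy of a DeepSpeed config with train_batch_size set to "auto".

    Hypothetical helper for illustration; not part of the Mistral codebase.
    """
    updated = dict(config)
    updated["train_batch_size"] = "auto"
    return updated


# Hypothetical config contents; the real config files in the repo may differ.
example_config = {
    "train_batch_size": 512,
    "train_micro_batch_size_per_gpu": "auto",
    "zero_optimization": {"stage": 2},
}

patched_config = set_auto_batch_size(example_config)
print(json.dumps(patched_config, indent=2))
```

Leaving a hard-coded number in the config risks a mismatch with the batch size derived from the training arguments, which is presumably why the value is delegated with "auto" here.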