baichuan-inc / Baichuan-7B

A large-scale 7B pretraining language model developed by BaiChuan-Inc.
https://huggingface.co/baichuan-inc/baichuan-7B
Apache License 2.0

Pretrain learning rate is 1e-8? #119

Open hegang1-tal opened 1 year ago

hegang1-tal commented 1 year ago


Questions

Hi, I found in the DeepSpeed config file that the learning rate is 1e-8. I am wondering whether this is too small for pretraining?
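For context, here is a minimal sketch of where a learning rate usually lives in a DeepSpeed config. All values and the scheduler choice below are illustrative assumptions, not the repository's actual file: in many DeepSpeed setups the optimizer's `lr` is only a floor, and a scheduler block such as `WarmupDecayLR` supplies the effective peak rate through `warmup_max_lr`.

```python
import json

# Illustrative DeepSpeed config fragment (hypothetical values, not Baichuan-7B's real config).
# The optimizer "lr" here is 1e-8, but with a warmup scheduler the training-time
# learning rate ramps up to "warmup_max_lr", so the optimizer value alone does not
# determine the rate actually used during pretraining.
ds_config = {
    "optimizer": {
        "type": "AdamW",
        "params": {
            "lr": 1e-8,            # the value the question refers to
            "betas": [0.9, 0.95],
            "weight_decay": 0.1,
        },
    },
    "scheduler": {
        "type": "WarmupDecayLR",   # assumed scheduler for illustration
        "params": {
            "warmup_min_lr": 1e-8,   # starting LR during warmup
            "warmup_max_lr": 2e-4,   # hypothetical peak LR after warmup
            "warmup_num_steps": 2000,
            "total_num_steps": 100000,
        },
    },
}

# Print the fragment as the JSON that would appear in a ds_config.json file.
print(json.dumps(ds_config, indent=2))
```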
