Closed: wuhuanon closed this issue 5 months ago
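This parameter sets the fraction of total training steps used for learning-rate warmup. If you are using the DeepSpeed-Chat training code, you can refer to run_llama2_7b.sh and use the num_warmup_steps parameter to specify the number of warmup steps directly.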
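To make the relationship between a warmup fraction and a warmup step count concrete, here is a minimal sketch. It is not taken from the repository's training code: the helper names `warmup_steps_from_rate` and `linear_warmup_lr` are hypothetical, and the linear ramp followed by a constant rate is an assumed schedule chosen only for illustration.

```python
def warmup_steps_from_rate(total_steps: int, warmup_rate: float) -> int:
    """Convert a warmup fraction into a step count (hypothetical helper)."""
    if not 0.0 <= warmup_rate <= 1.0:
        raise ValueError("warmup_rate must be in [0, 1]")
    return int(round(total_steps * warmup_rate))


def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Learning rate at a 0-indexed step.

    Assumed schedule: ramp linearly up to base_lr over warmup_steps,
    then stay at base_lr. Real schedulers usually decay afterwards.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        # No warmup configured, or warmup already finished.
        return base_lr
    return base_lr * (step + 1) / warmup_steps


if __name__ == "__main__":
    total = 1000
    steps = warmup_steps_from_rate(total, warmup_rate=0.1)
    print("warmup steps:", steps)
    for s in (0, steps // 2, steps, total - 1):
        print(s, linear_warmup_lr(s, base_lr=1e-4, warmup_steps=steps))
```

Under these assumptions, a warmup fraction of 0 gives zero warmup steps, so the learning rate starts at its base value; that would be one way to train without warmup. Whether the actual training script accepts 0 is something to check in its argument definitions.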
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.
The following items must be checked before submitting
Issue type
Other issues
Base model
Others
Operating system
Linux
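Detailed description of the issue
What is the warmup_rate parameter for, and is it possible to train without it?
Dependencies (must be provided for code-related issues)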
No response
Run logs or screenshots
No response