Closed jiejie1993 closed 1 year ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.
The following items must be checked before submitting
Issue type
Model training and fine-tuning
Base model
Chinese-LLaMA-2 (7B/13B)
Operating system
Linux
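Detailed description of the problem
Script: run_pt_ddp_upgrade.sh
During the pre-training stage, I trained on multiple nodes with multiple GPUs using the deepspeed-zero2 configuration. The hardware is A100. The log printed when training finished reports a training time of 14 hours, but the job actually took roughly 90 hours. What causes this discrepancy?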
Dependencies (must be provided for code-related issues)
Run logs or screenshots