Training takes too long - Githubissues

shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.

Apache License 2.0

3.37k stars, 507 forks

Open ucaslei opened 2 months ago

ucaslei commented 2 months ago

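I am running incremental pretraining of a 13B model on roughly 2B tokens for one epoch, without PEFT (full-parameter training), on 8 A800 GPUs. The estimated training time is 400 hours, which is far longer than the theoretical time. What could be causing this, and how long does a run like this normally take?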
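For reference, the "theoretical time" can be roughly estimated as follows. This is only a back-of-the-envelope sketch, not something taken from the MedicalGPT code: it assumes the common approximation of about 6 FLOPs per parameter per token for a forward and backward pass, an assumed peak of 312 TFLOPS (dense BF16) per A800, and a hypothetical 40% model FLOPs utilization (MFU). The function names are made up for illustration.

```python
def estimate_training_hours(n_params, n_tokens, n_gpus, peak_flops_per_gpu, mfu):
    """Estimate wall-clock hours using the ~6 * params * tokens FLOPs rule of thumb."""
    total_flops = 6.0 * n_params * n_tokens                  # forward + backward pass
    effective_flops_per_sec = n_gpus * peak_flops_per_gpu * mfu
    return total_flops / effective_flops_per_sec / 3600.0


def implied_mfu(n_params, n_tokens, n_gpus, peak_flops_per_gpu, hours):
    """Back out the utilization that a given wall-clock time would imply."""
    total_flops = 6.0 * n_params * n_tokens
    return total_flops / (n_gpus * peak_flops_per_gpu * hours * 3600.0)


if __name__ == "__main__":
    N_PARAMS = 13e9      # 13B model (from the question)
    N_TOKENS = 2e9       # ~2B tokens, one epoch (from the question)
    N_GPUS = 8           # 8 x A800 (from the question)
    PEAK_FLOPS = 312e12  # assumption: dense BF16 peak per GPU
    ASSUMED_MFU = 0.40   # assumption: hypothetical utilization, not measured

    hours = estimate_training_hours(N_PARAMS, N_TOKENS, N_GPUS, PEAK_FLOPS, ASSUMED_MFU)
    mfu_at_400h = implied_mfu(N_PARAMS, N_TOKENS, N_GPUS, PEAK_FLOPS, 400.0)
    print(f"Estimated time at {ASSUMED_MFU:.0%} MFU: {hours:.1f} hours")
    print(f"MFU implied by a 400-hour run: {mfu_at_400h:.1%}")
```

Under these assumptions the estimate comes out to roughly 43 hours, and a 400-hour run would correspond to about 4% utilization. The real figure depends on sequence length, precision, parallelism strategy, and data loading, so this is only a sanity check on the order of magnitude.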