shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0

Continued pretraining: after merging the LoRA model with the base model (baichuan2), inference becomes very slow, taking several minutes per response, whereas the original baichuan2 only takes a few seconds #273

Closed — sweetboxwwy closed this 10 months ago

sweetboxwwy commented 10 months ago

Describe the bug

Please provide a clear and concise description of what the bug is. If applicable, add screenshots to help explain your problem, especially for visualization related problems.

sweetboxwwy commented 10 months ago

I tested the inference time of the merged models after training baichuan2 on different datasets. With `--model_max_length` set to 1024, inference time is normal, but after training with it set to 512, inference time becomes much longer.
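For context on why the merge itself is unlikely to be the bottleneck: folding a LoRA adapter into the base weights produces a dense matrix of the same shape as the original, so a merged model's per-token cost equals the base model's. A minimal numpy sketch of the merge math (all shapes, names, and values here are hypothetical, not taken from MedicalGPT's code):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 8, 2, 16           # hidden size, LoRA rank, LoRA alpha (hypothetical)
W = rng.standard_normal((d, d))  # frozen base weight (stands in for a baichuan2 layer)
A = rng.standard_normal((r, d))  # LoRA down-projection, trained
B = np.zeros((d, r))             # LoRA up-projection, zero-initialized
B[0, 0] = 0.5                    # pretend training updated one entry

scaling = alpha / r              # standard LoRA scaling factor
W_merged = W + scaling * (B @ A) # fold the adapter into the dense weight

# A forward pass through the merged weight is a single matmul, identical in
# cost to the base model, and numerically equal to base + adapter paths:
x = rng.standard_normal(d)
assert np.allclose(W_merged @ x, W @ x + scaling * (B @ (A @ x)))
```

Since the merged architecture is unchanged, a minutes-long response more plausibly means many more tokens are being generated, e.g. a model trained with a shorter `--model_max_length` failing to emit EOS and decoding until the generation cap; capping `max_new_tokens` at inference time would help distinguish the two causes. This is a hypothesis, not something confirmed in the thread.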