shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0

Continued pretraining: after merging the LoRA model with the base model (baichuan2), inference becomes very slow, taking several minutes per response, whereas the original baichuan2 only takes a few seconds #273

Closed — sweetboxwwy closed this 10 months ago

sweetboxwwy commented 10 months ago

Describe the bug

Please provide a clear and concise description of what the bug is. If applicable, add screenshots to help explain your problem, especially for visualization related problems.

sweetboxwwy commented 10 months ago

I tested the inference time of the merged models after training baichuan2 on different datasets. With `--model_max_length` set to 1024, inference time is normal, but after training with it set to 512, inference time becomes much longer.
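For context on why the merge itself is unlikely to be the bottleneck: folding a LoRA adapter into the base weights produces a dense matrix of the same shape as the original, so a merged model's per-token cost equals the base model's. A minimal numpy sketch of the merge math (all shapes, names, and values here are hypothetical, not taken from MedicalGPT's code):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 8, 2, 16           # hidden size, LoRA rank, LoRA alpha (hypothetical)
W = rng.standard_normal((d, d))  # frozen base weight (stands in for a baichuan2 layer)
A = rng.standard_normal((r, d))  # LoRA down-projection, trained
B = np.zeros((d, r))             # LoRA up-projection, zero-initialized
B[0, 0] = 0.5                    # pretend training updated one entry

scaling = alpha / r              # standard LoRA scaling factor
W_merged = W + scaling * (B @ A) # fold the adapter into the dense weight

# A forward pass through the merged weight is a single matmul, identical in
# cost to the base model, and numerically equal to base + adapter paths:
x = rng.standard_normal(d)
assert np.allclose(W_merged @ x, W @ x + scaling * (B @ (A @ x)))
```

Since the merged architecture is unchanged, a minutes-long response more plausibly means many more tokens are being generated, e.g. a model trained with a shorter `--model_max_length` failing to emit EOS and decoding until the generation cap; capping `max_new_tokens` at inference time would help distinguish the two causes. This is a hypothesis, not something confirmed in the thread.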