shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0
3.24k stars 492 forks

Error when merging the model after RL reinforcement-learning training; the earlier steps followed the pipeline commands exactly #251

Closed PICOPON closed 10 months ago

PICOPON commented 11 months ago

4-GPU machine; the earlier steps followed the pipeline commands exactly.

python merge_peft_adapter.py --model_type bloom \
    --base_model_name_or_path merged-sft \
    --peft_model_path outputs-rl-v1 \
    --output_dir merged-rl/

[screenshot: error output from the merge step]

shibing624 commented 11 months ago

Did the error occur during training? It is probably insufficient GPU memory: the model was sharded across 2 cards. Manually change device="auto" to device="cuda:0" to force it to run on a single GPU.
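The suggested fix can be sketched as follows. This is a minimal illustration (the helper name `device_map_for_merge` is hypothetical, not part of the repo): with Hugging Face `from_pretrained`, `device_map="auto"` lets accelerate shard the model's layers across all visible GPUs, while an explicit `"cuda:0"` keeps every weight on one card, which is what the reply recommends for the merge step.

```python
# Hypothetical helper illustrating the maintainer's suggestion: replace
# device_map="auto" (accelerate may split layers across GPUs, the suspected
# failure mode here) with an explicit "cuda:0" (whole model on one GPU).
def device_map_for_merge(force_single_gpu: bool = True) -> str:
    """Return the device_map value to pass to from_pretrained()."""
    # "auto"    -> shard across all visible GPUs
    # "cuda:0"  -> pin the entire model to GPU 0
    return "cuda:0" if force_single_gpu else "auto"

print(device_map_for_merge())       # cuda:0
print(device_map_for_merge(False))  # auto
```

An equivalent workaround, without editing the script, is to restrict visibility to one GPU when launching the merge, e.g. `CUDA_VISIBLE_DEVICES=0 python merge_peft_adapter.py ...`.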