l294265421 / alpaca-rlhf

Finetuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat
https://88aeeb3aef5040507e.gradio.live/
MIT License
103 stars 13 forks

Training issue #15

Open wanghao-007 opened 10 months ago

wanghao-007 commented 10 months ago

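Is any model parallelism applied when the model is loaded? I found that running even a 7B model directly in deepspeed-chat runs out of GPU memory. The GPU is an A100 80G.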
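For context on why a 7B model can exceed 80 GB, here is a rough back-of-the-envelope sketch, not a measurement of this repository. It assumes mixed-precision Adam training, which commonly costs about 16 bytes per parameter for model states (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 master weights, momentum, and variance), and it assumes exactly 7e9 parameters. The function name `model_state_gb` and the 8-GPU setup are hypothetical, chosen only for illustration. Activations, temporary buffers, and any additional models held in memory during RLHF (for example a critic, reference, or reward model) are not counted, so real usage would be higher.

```python
# Rough estimate of per-GPU memory for model states only
# (weights, gradients, optimizer states) under mixed-precision Adam.
# Assumption: 16 bytes per parameter in total, split as below.
FP16_WEIGHT_BYTES = 2
FP16_GRAD_BYTES = 2
OPTIMIZER_BYTES = 12  # fp32 master weights + Adam momentum + Adam variance


def model_state_gb(num_params: float, num_gpus: int = 1, zero_stage: int = 0) -> float:
    """Approximate per-GPU model-state memory in GB (1 GB = 1e9 bytes).

    ZeRO stage 1 partitions optimizer states across GPUs, stage 2 also
    partitions gradients, and stage 3 also partitions the weights.
    """
    weights = float(FP16_WEIGHT_BYTES)
    grads = float(FP16_GRAD_BYTES)
    optim = float(OPTIMIZER_BYTES)
    if zero_stage >= 1:
        optim /= num_gpus
    if zero_stage >= 2:
        grads /= num_gpus
    if zero_stage >= 3:
        weights /= num_gpus
    return num_params * (weights + grads + optim) / 1e9


if __name__ == "__main__":
    params = 7e9  # assumed parameter count for a "7B" model
    print(f"single GPU, no ZeRO: {model_state_gb(params):.2f} GB")
    for stage in (1, 2, 3):
        gb = model_state_gb(params, num_gpus=8, zero_stage=stage)
        print(f"8 GPUs, ZeRO stage {stage}: {gb:.2f} GB per GPU")
```

Under these assumptions, the model states alone for one 7B model come to roughly 112 GB without any partitioning, which is already more than a single 80 GB card can hold. Partitioning with ZeRO across several GPUs, or offloading optimizer states to CPU, are the usual ways to bring the per-GPU figure down.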