OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0

PPO training configuration for train_ppo_llama.sh #272

Closed: MurrayTom closed this issue 3 days ago

MurrayTom commented 2 months ago

Hello, I want to run train_ppo_llama.sh on 4× A100 80GB GPUs. Do I need to reconfigure the GPU allocation for the four models?

hijkzzz commented 2 months ago

train_ppo_llama.sh shares the GPUs among all of the models, so it runs on 4 GPUs as-is; you only need to reconfigure the GPU allocation if you use train_ppo_llama_ray.sh.
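
If you do go through the Ray script, a minimal single-node layout for 4 GPUs might look like the sketch below. The `*_num_nodes` / `*_num_gpus_per_node` flags and the `openrlhf.cli.train_ppo_ray` entry point follow the OpenRLHF Ray examples, and the model paths are placeholders; verify both against the train_ppo_llama_ray.sh shipped with your version, since options can differ between releases.

```bash
# Sketch of a single-node, 4-GPU allocation for the Ray-based PPO script
# (one GPU each for the reference, reward, critic, and actor models).
# Flag names follow the OpenRLHF Ray examples; check them against your
# local train_ppo_llama_ray.sh before running.
ray start --head --num-gpus 4   # local Ray cluster exposing all 4 GPUs

python3 -m openrlhf.cli.train_ppo_ray \
    --ref_num_nodes 1    --ref_num_gpus_per_node 1 \
    --reward_num_nodes 1 --reward_num_gpus_per_node 1 \
    --critic_num_nodes 1 --critic_num_gpus_per_node 1 \
    --actor_num_nodes 1  --actor_num_gpus_per_node 1 \
    --pretrain <path-to-your-sft-model> \
    --reward_pretrain <path-to-your-reward-model> \
    --zero_stage 2 \
    --bf16
```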