For each stage of the project (SFT, reward model training, PPO), how many GPUs should be used, and how much VRAM is needed apiece?

I was working with 80 GB A100s: 1 GPU for SFT and reward model training (if I remember correctly), and 4 GPUs for PPO training. But I know some people have used this code and made it work with smaller/fewer GPUs (for example, on 32 GB V100s) by reducing the batch/chunk size.

Perfect, sounds good! Thank you :)

Closed: RylanSchaeffer closed this 1 month ago
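The "reduce batch/chunk size" trick for fitting smaller GPUs is usually paired with gradient accumulation so the effective batch size stays the same. A minimal sketch of that arithmetic, with all batch numbers hypothetical (the repo's actual hyperparameter names and values are not given in this thread):

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Effective batch = per-device batch x gradient-accumulation steps x GPU count."""
    return per_device_batch * grad_accum_steps * num_gpus

# Hypothetical 80 GB A100 setup: 4 GPUs, large per-device chunks, no accumulation.
a100 = effective_batch_size(per_device_batch=16, grad_accum_steps=1, num_gpus=4)

# Hypothetical 32 GB V100 setup: same 4 GPUs, 4x smaller chunks to fit in VRAM,
# 4x more accumulation steps so the optimizer sees the same effective batch.
v100 = effective_batch_size(per_device_batch=4, grad_accum_steps=4, num_gpus=4)

assert a100 == v100 == 64
```

Activation memory scales roughly with the per-device chunk size, so shrinking it (and accumulating gradients over more steps) trades wall-clock time for VRAM without changing the training dynamics much.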