OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0

Suggestion on the configurations #304

Status: Open · Ricardokevins opened this issue 1 month ago

Ricardokevins commented 1 month ago

Hello, great job and very neat code and design!

I would like to ask whether there are more detailed recommendations for choosing the rollout batch size, the train batch size, and the number of nodes for each component (actor, etc.). In particular, a brief explanation of what these hyperparameters mean and how they affect performance and computational efficiency would be very helpful.

hijkzzz commented 1 month ago

Thank you for your suggestions.

- `rollout_batch_size` is the size of the PPO experience replay buffer.
- The number of nodes is the number of machine nodes allocated to each model (actor, critic, etc.).
- `train_batch_size` is the overall training batch size.
- `micro_train_batch_size` is the batch size per GPU (larger is generally better).
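
To make the relationship between these sizes concrete, here is a minimal Python sketch of how they typically interact in a distributed PPO setup. The variable names and numbers below are illustrative assumptions, not OpenRLHF's actual flags or defaults; the example scripts in the repo are the authoritative reference.

```python
# Illustrative sketch of how the batch-size hyperparameters usually relate
# in distributed PPO training. All names and values here are hypothetical.

rollout_batch_size = 1024      # prompts collected into the experience buffer per PPO round
train_batch_size = 128         # global batch size for each optimizer update
micro_train_batch_size = 2     # per-GPU batch size during the training phase
num_actor_gpus = 8             # total GPUs for the actor (nodes * GPUs per node)

# Gradients are accumulated across micro-batches until the global batch is reached.
grad_accum_steps = train_batch_size // (micro_train_batch_size * num_actor_gpus)

# The experience buffer is consumed in train_batch_size chunks per PPO round.
updates_per_rollout = rollout_batch_size // train_batch_size

print(f"gradient accumulation steps: {grad_accum_steps}")      # -> 8
print(f"optimizer updates per rollout: {updates_per_rollout}")  # -> 8
```

Under these assumptions, `train_batch_size` should be divisible by `micro_train_batch_size` times the number of GPUs, and `rollout_batch_size` should be divisible by `train_batch_size`; raising `micro_train_batch_size` mainly improves per-GPU throughput without changing the effective batch size.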