OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0

Fix tensor shapes in Experience class documentation #225

Closed Thecats-Jfm closed 4 months ago

Thecats-Jfm commented 4 months ago

Corrects the documented shapes of the values, returns, and advantages tensors in the Experience class. The documentation previously listed these tensors as having shape (B), which implies a batch dimension only. They actually have shape (B, A), where "B" is the batch size and "A" is the number of actions. With this change, the documented shapes match the data the class actually holds.
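To make the difference between (B) and (B, A) concrete, here is a minimal sketch of a shape check for the three affected fields. It is not code from OpenRLHF: the names PerActionShapes and find_shape_mismatches are hypothetical, and shapes are passed as plain tuples so the example runs without PyTorch (with real tensors, one would likely pass tuple(tensor.shape)).

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = Tuple[int, ...]


@dataclass
class PerActionShapes:
    """Hypothetical record of the shapes of the three fields named in this PR.

    Shapes (as corrected in the documentation):
        values:     (B, A)
        returns:    (B, A)
        advantages: (B, A)
    where B is the batch size and A is the number of actions.
    """

    values: Shape
    returns: Shape
    advantages: Shape


def find_shape_mismatches(
    shapes: PerActionShapes, batch_size: int, num_actions: int
) -> List[str]:
    """Return the names of fields whose shape is not (batch_size, num_actions)."""
    expected = (batch_size, num_actions)
    fields = ("values", "returns", "advantages")
    return [name for name in fields if getattr(shapes, name) != expected]


if __name__ == "__main__":
    # Shapes matching the corrected documentation: nothing is flagged.
    documented = PerActionShapes(values=(4, 16), returns=(4, 16), advantages=(4, 16))
    print(find_shape_mismatches(documented, batch_size=4, num_actions=16))

    # Shapes matching the old (B) documentation: every field is flagged.
    old_docs = PerActionShapes(values=(4,), returns=(4,), advantages=(4,))
    print(find_shape_mismatches(old_docs, batch_size=4, num_actions=16))
```

A check like this would flag any code written against the old (B) documentation, since a shape of (4,) never equals (4, 16).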