OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0
8.11k stars, 819 forks

[Feature] PPO Support #851

Closed: wheresmyhair closed this 2 weeks ago

wheresmyhair commented 3 weeks ago

Description

Add PPO (Proximal Policy Optimization) support.

Pipeline Tests

Work in progress (WIP).