OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0
8.11k stars, 819 forks

[Feature] Iterative DPO support #859

wheresmyhair closed 1 week ago

wheresmyhair commented 1 week ago

Description

Add support for iterative DPO (Direct Preference Optimization).

Pipeline Tests

WIP