OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0

Reward modeling support #836

Closed by wheresmyhair 4 months ago

wheresmyhair commented 4 months ago

[Ready for review] Reward modeling support

Tested on:

  1. Full finetuning (full)

  2. LoRA (lora)

  3. LISA (lisa)

research4pan commented 4 months ago

Several additional fixes in this PR: