As briefly discussed with @lewtun this morning, this PR adds the scripts/run_kto.py script, which fine-tunes LLMs with trl.KTOTrainer from within the alignment-handbook.
The script should work as is, but it still needs to be tested. cc @jiwooya1000 and @nlee-208 in case you're interested.
Description
The main reference used to put the script together has been https://github.com/huggingface/trl/blob/main/examples/scripts/kto.py.
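For anyone who wants to help test, here is a minimal sketch of the unpaired data layout that trl.KTOTrainer is documented to consume: one row per completion with prompt, completion, and a boolean label columns. This is an illustration only. The helper name to_kto_rows and the paired input keys (prompt, chosen, rejected) are assumptions made for the example, and it does not show how scripts/run_kto.py itself prepares its data.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, bool]]


def to_kto_rows(pairs: List[Dict[str, str]]) -> List[Row]:
    """Unroll paired preference rows into unpaired KTO-style rows.

    Hypothetical helper: each input row is assumed to have the keys
    "prompt", "chosen", and "rejected". Each output row has the keys
    "prompt", "completion", and "label" (True = desirable completion,
    False = undesirable completion).
    """
    rows: List[Row] = []
    for pair in pairs:
        # The preferred answer becomes a desirable example.
        rows.append(
            {"prompt": pair["prompt"], "completion": pair["chosen"], "label": True}
        )
        # The dispreferred answer becomes an undesirable example.
        rows.append(
            {"prompt": pair["prompt"], "completion": pair["rejected"], "label": False}
        )
    return rows


if __name__ == "__main__":
    example = [{"prompt": "2 + 2 = ?", "chosen": "4", "rejected": "5"}]
    for row in to_kto_rows(example):
        print(row)
```

Because KTO only needs a per-completion desirable/undesirable signal, datasets that were never collected as pairs can be used directly as long as they provide these three columns.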