axolotl-ai-cloud / axolotl

Go ahead and axolotl questions
https://axolotl-ai-cloud.github.io/axolotl/
Apache License 2.0
7.87k stars 866 forks

KTO (unpaired) Support #1200

Open kawine opened 9 months ago

kawine commented 9 months ago

⚠️ Please check that this feature request hasn't been suggested before.

🔖 Feature description

The original HALOs repo supports the full version of KTO that allows for imbalanced data and more stable training.

See trainer here: https://github.com/ContextualAI/HALOs/blob/6333a8f03c5c12c0a0b791e083904eda47a5b96c/trainers.py#L758

This library currently only supports KTO training on pairs using the less stable version of the loss, called SimpleKTO in the original repo.

✔️ Solution

Add support for the KTOTrainer from the original repo, which uses the full loss described in the attached image (note that there is no backpropagation through the KL term; it is used only for saturation):

[Image: Screen Shot 2024-01-24 at 8 34 44 PM]
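For readers who cannot see the image, here is a minimal sketch of the full (unpaired) KTO loss in plain Python. It assumes the formulation published in the KTO paper and is not a copy of the attached image or of the HALOs trainer. The name `kto_loss`, its parameters, and the default values are illustrative only. In an autograd framework the KL estimate would be detached so that no gradient flows through it; here it is an ordinary float.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def kto_loss(desirable_logratios, undesirable_logratios, kl_logratios,
             beta=0.1, desirable_weight=1.0, undesirable_weight=1.0):
    """Sketch of the unpaired KTO loss (illustrative, not a library API).

    Each log-ratio is log pi_theta(y|x) - log pi_ref(y|x) for one example.
    `kl_logratios` are log-ratios on mismatched (x, y') pairs, used only
    to estimate the KL reference point.
    """
    if not desirable_logratios and not undesirable_logratios:
        raise ValueError("need at least one desirable or undesirable example")

    # KL estimate: batch mean, clamped at zero. Treated as a constant
    # (in a real trainer this value would be detached from the graph).
    kl = max(0.0, sum(kl_logratios) / len(kl_logratios)) if kl_logratios else 0.0

    losses = []
    # Desirable examples: reward log-ratios that exceed the KL reference.
    for r in desirable_logratios:
        losses.append(desirable_weight * (1.0 - sigmoid(beta * (r - kl))))
    # Undesirable examples: reward log-ratios that fall below the KL reference.
    for r in undesirable_logratios:
        losses.append(undesirable_weight * (1.0 - sigmoid(beta * (kl - r))))

    return sum(losses) / len(losses)
```

Because the two groups are handled in separate loops, the batch can be imbalanced or contain only one kind of example, e.g. `kto_loss([1.0, 2.0, 3.0], [], [0.0])`. The two weights are the knobs one would adjust to compensate for such an imbalance.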

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

kawine commented 5 months ago

@winglian given that follow-up studies have found that unpaired KTO outperforms DPO/IPO/CPO on various tasks (https://www.semanticscholar.org/paper/Insights-into-Alignment%3A-Evaluating-DPO-and-its-Saeidi-Verma/db407c3a60c6dc768fde8dd1088dab3be951f04e), would it be possible to add support for it in axolotl?

The TRL implementation of KTO is now stable, in case a reference other than the original repo (https://github.com/ContextualAI/HALOs) is needed.
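To make "unpaired" concrete, here is a small sketch of flattening paired preference data into independent labeled examples. The helper name `unpair` and the field names (`prompt`, `chosen`, `rejected`, `completion`, `label`) are assumptions chosen for illustration; check the target trainer's documentation for the exact schema it expects.

```python
def unpair(paired_examples):
    """Flatten (prompt, chosen, rejected) pairs into unpaired records.

    Each output record carries one completion and a boolean label:
    True for desirable, False for undesirable. Field names are hypothetical.
    """
    records = []
    for ex in paired_examples:
        records.append(
            {"prompt": ex["prompt"], "completion": ex["chosen"], "label": True}
        )
        records.append(
            {"prompt": ex["prompt"], "completion": ex["rejected"], "label": False}
        )
    return records


pairs = [{"prompt": "2+2=", "chosen": "4", "rejected": "5"}]
unpaired = unpair(pairs)
```

Once the data is in this shape, nothing requires the two labels to appear in equal numbers, which is the imbalanced setting the full KTO loss is meant to handle.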