Open kawine opened 9 months ago
@winglian given that follow-up studies have found that unpaired KTO outperforms DPO/IPO/CPO on various tasks (https://www.semanticscholar.org/paper/Insights-into-Alignment%3A-Evaluating-DPO-and-its-Saeidi-Verma/db407c3a60c6dc768fde8dd1088dab3be951f04e), would it be possible to add support for it in axolotl?
The TRL implementation of KTO is now stable, in case a reference other than the original repo (https://github.com/ContextualAI/HALOs) is needed.
Please check that this feature request hasn't been suggested before.
Feature description
The original HALOs repo supports the full version of KTO that allows for imbalanced data and more stable training.
See trainer here: https://github.com/ContextualAI/HALOs/blob/6333a8f03c5c12c0a0b791e083904eda47a5b96c/trainers.py#L758
This library currently only supports KTO training on pairs using the less stable version of the loss, called SimpleKTO in the original repo.
Solution
Port the KTOTrainer from the original repo with the full loss, as described in the attached image. Note that there is no backpropagation through the KL term; it is used only to control saturation.
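For reference, here is a minimal sketch of the full (unpaired) loss, following my reading of the KTO paper rather than the exact code in the HALOs trainer. The function name `kto_losses` and all of its parameter names are hypothetical. It works on precomputed sequence log-probabilities in plain Python, so there is no autograd here; in a real trainer the KL estimate would be detached so that no gradient flows through it.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def kto_losses(policy_logps, ref_logps, is_desirable,
               kl_policy_logps, kl_ref_logps,
               beta=0.1, desirable_weight=1.0, undesirable_weight=1.0):
    """Per-example KTO losses for unpaired data (illustrative sketch).

    policy_logps / ref_logps: log pi(y|x) under the policy and the reference
    model for each example. is_desirable: one bool per example.
    kl_policy_logps / kl_ref_logps: log-probs of mismatched (x, y') pairs,
    used only to estimate the KL reference point.
    """
    # KL estimate: batch mean of the log-ratio on mismatched pairs, clamped
    # at zero. Treated as a constant (detached in an autograd framework).
    kl = sum(p - r for p, r in zip(kl_policy_logps, kl_ref_logps))
    kl = max(0.0, kl / len(kl_policy_logps))

    losses = []
    for p, r, good in zip(policy_logps, ref_logps, is_desirable):
        log_ratio = p - r  # implied reward: log pi_theta / pi_ref
        if good:
            value = _sigmoid(beta * (log_ratio - kl))
            losses.append(desirable_weight * (1.0 - value))
        else:
            value = _sigmoid(beta * (kl - log_ratio))
            losses.append(undesirable_weight * (1.0 - value))
    return losses
```

Because desirable and undesirable examples are weighted separately, the batch does not need matched pairs; the two weights are what would let a config compensate for imbalanced data.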
Alternatives
No response
Additional Context
No response
Acknowledgements