huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.2k stars, 357 forks

DPO/IPO/KTO ablations #104

Closed by edbeeching 5 months ago

edbeeching commented 5 months ago

Adds a README and configs for the scans in the blog post. I still need to validate that everything runs on the cluster.

HuggingFaceDocBuilderDev commented 5 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.