wyona / katie-backend

Katie Backend
https://katie.qa
Apache License 2.0

Automatically generate preference dataset #8

Closed: michaelwechner closed this issue 8 months ago

michaelwechner commented 8 months ago

To fine-tune an LLM with DPO (Direct Preference Optimization), we need a preference dataset. See for example:

https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac

https://huggingface.co/datasets/Anthropic/hh-rlhf?row=0
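
For illustration, here is a minimal Python sketch of what generating such a dataset could look like. It is not the code from this repository. It assumes a hypothetical input of answers that each carry a numeric rating (higher is better), and it emits records in the prompt / chosen / rejected layout commonly used for DPO training. The names `RatedAnswer`, `build_preference_pairs` and `to_jsonl` are made up for this example.

```python
import json
from dataclasses import dataclass
from itertools import product


@dataclass
class RatedAnswer:
    # Hypothetical input record: one answer to a question plus a rating.
    question: str
    answer: str
    rating: int  # assumed scale: a higher value means a better answer


def build_preference_pairs(rated):
    """Pair every higher-rated answer with every lower-rated answer to the same question."""
    # Group answers by question, keeping first-seen order of the questions.
    by_question = {}
    for item in rated:
        by_question.setdefault(item.question, []).append(item)

    pairs = []
    for question, answers in by_question.items():
        for better, worse in product(answers, answers):
            # Equal ratings carry no preference signal, so they are skipped.
            if better.rating > worse.rating:
                pairs.append(
                    {
                        "prompt": question,
                        "chosen": better.answer,
                        "rejected": worse.answer,
                    }
                )
    return pairs


def to_jsonl(pairs):
    """Serialize the pairs as JSON Lines, one preference record per line."""
    return "\n".join(json.dumps(pair, ensure_ascii=False) for pair in pairs)
```

Questions with only one answer, or with answers that all share the same rating, produce no pairs in this sketch, since there is no preference to learn from them.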

michaelwechner commented 8 months ago

Branch GH-8_preference_dataset created

michaelwechner commented 8 months ago

Implemented in branch GH-8_preference_dataset and merged into main