eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)
Apache License 2.0
2.18k stars 180 forks

Pythia2.8B model weights #50

Closed alexv-cerebras closed 1 year ago

alexv-cerebras commented 1 year ago

Hi,

Do you have weights for the DPO-Pythia2.8B model fine-tuned on the Anthropic-HH dataset?

AmanSinghal927 commented 7 months ago

Hey, I was wondering the same thing.

AmanSinghal927 commented 7 months ago

@alexv-cerebras