eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)
Apache License 2.0
2.18k stars 180 forks

Pythia2.8B model weights #50

Closed alexv-cerebras closed 1 year ago

alexv-cerebras commented 1 year ago

Hi,

Do you have weights for the DPO-Pythia2.8B model fine-tuned on the Anthropic-HH dataset?

AmanSinghal927 commented 7 months ago

Hey, I was wondering the same thing.

AmanSinghal927 commented 7 months ago

@alexv-cerebras