eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)
Apache License 2.0
2.18k stars 180 forks

Initial commit src/data.py #79

Closed lesnikow closed 6 months ago

lesnikow commented 6 months ago

Motivation is to fetch the Reddit TL;DR, Stiennon et al. human preference, and IMDB datasets programmatically.
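The PR itself isn't shown here, but a minimal sketch of what programmatic loading in `src/data.py` could look like, assuming the Hugging Face `datasets` library (the function and dictionary names below are illustrative, not the actual contents of the PR):

```python
# Hypothetical loader registry for src/data.py.
# Maps a short dataset key to a (hub_path, config) pair on the Hugging Face Hub.
DATASET_NAMES = {
    # Stiennon et al. (2020) human preference comparisons over Reddit TL;DR summaries
    "tldr_comparisons": ("openai/summarize_from_feedback", "comparisons"),
    # IMDB movie-review sentiment corpus
    "imdb": ("imdb", None),
}


def load(name, split="train"):
    """Download and return one of the registered datasets by key."""
    from datasets import load_dataset  # pip install datasets

    path, config = DATASET_NAMES[name]
    if config is None:
        return load_dataset(path, split=split)
    return load_dataset(path, config, split=split)


if __name__ == "__main__":
    # Example usage (downloads data on first call):
    ds = load("imdb", split="train")
    print(len(ds))
```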