eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)
Apache License 2.0

Implementation for Plackett-Luce rank model #71

Open rohan598 opened 7 months ago

rohan598 commented 7 months ago

@eric-mitchell Will you be adding an implementation of the Plackett-Luce ranking model in addition to the current Bradley-Terry model?

Looking forward to hearing from you!

jdchang1 commented 5 months ago

@rohan598 I was wondering if you made any headway in this direction? Thanks!
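
For anyone landing on this thread, here is a minimal sketch of what a Plackett-Luce variant of the DPO loss could look like. It is not code from this repository. It assumes the objective is the negative log-likelihood of a full ranking of K responses, where each response's implicit reward is `beta * (log pi(y|x) - log pi_ref(y|x))`. The function names `plackett_luce_dpo_loss` and `bradley_terry_dpo_loss` are hypothetical, and the sketch uses plain Python floats for one example rather than batched tensors.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def plackett_luce_dpo_loss(policy_logps, reference_logps, beta=0.1):
    """Hypothetical Plackett-Luce DPO loss for one prompt.

    Both lists hold sequence log-probabilities of K responses,
    ordered from most preferred to least preferred.
    """
    if len(policy_logps) != len(reference_logps):
        raise ValueError("policy and reference lists must have equal length")
    if len(policy_logps) < 2:
        raise ValueError("need at least two ranked responses")

    # Implicit reward of each response: beta * log-ratio to the reference.
    rewards = [beta * (p - r) for p, r in zip(policy_logps, reference_logps)]

    # Negative log-likelihood of the ranking: at each position k, the
    # response at k must "win" a softmax over itself and all lower-ranked
    # responses. The last position contributes zero, so it is skipped.
    loss = 0.0
    for k in range(len(rewards) - 1):
        loss += logsumexp(rewards[k:]) - rewards[k]
    return loss


def bradley_terry_dpo_loss(chosen_logp, rejected_logp,
                           ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Pairwise DPO loss, -log(sigmoid(margin)), for comparison."""
    margin = beta * ((chosen_logp - ref_chosen_logp)
                     - (rejected_logp - ref_rejected_logp))
    return math.log1p(math.exp(-margin))


if __name__ == "__main__":
    # With K = 2 the Plackett-Luce loss coincides with the pairwise loss.
    pl = plackett_luce_dpo_loss([-1.0, -2.0], [-1.5, -1.8], beta=0.5)
    bt = bradley_terry_dpo_loss(-1.0, -2.0, -1.5, -1.8, beta=0.5)
    print(abs(pl - bt) < 1e-9)
```

With two responses the ranking loss reduces to the pairwise one, which gives a quick sanity check. A batched version would need the rewards arranged as a `(batch, K)` tensor in rank order. That is an integration detail this sketch leaves open.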