Dahoas / QDSyntheticData

SALMON: Self-Alignment with Principle-Following Reward Models #73

Open. Dahoas opened this issue 6 months ago

alon-albalak commented 5 months ago

I'll take this one