THUDM / ImageReward

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Apache License 2.0
1.03k stars, 54 forks

About the reward score #72

Open KN1GHT9 opened 4 months ago

KN1GHT9 commented 4 months ago

In your paper, the annotations appear to be star ratings. How are these converted into specific floating-point scores when training the RM?

xujz18 commented 4 months ago
[attached image]

You can refer to Section 2.2 (RM Training) of the paper for details.
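
For readers without the paper at hand, here is a minimal sketch of the general idea, assuming Section 2.2 works as I understand it: annotations for the same prompt are treated as a ranking, every non-tied pair of images becomes one comparison, and the RM is trained with a pairwise loss of the form -log(sigmoid(f(T, x_i) - f(T, x_j))), where x_i is the preferred image. Under that reading there is no direct mapping from stars to a target float; the scalar score is whatever the model learns in order to order the pairs correctly. The function names `ranking_to_pairs` and `pairwise_loss` are hypothetical and do not come from the ImageReward codebase, and the plain-float scores stand in for real model outputs.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranks):
    """Turn per-prompt ranks into (better, worse) pairs.

    ranks: dict mapping image id -> rank, where 1 is best.
    Images with equal rank (ties) produce no pair.
    """
    pairs = []
    for a, b in combinations(sorted(ranks), 2):
        if ranks[a] < ranks[b]:
            pairs.append((a, b))
        elif ranks[b] < ranks[a]:
            pairs.append((b, a))
    return pairs


def pairwise_loss(scores, pairs):
    """Mean of -log(sigmoid(score[better] - score[worse])) over all pairs.

    scores: dict mapping image id -> scalar reward (stand-in for the RM output).
    """
    total = 0.0
    for better, worse in pairs:
        diff = scores[better] - scores[worse]
        # -log(sigmoid(diff)) == softplus(-diff), written in a numerically stable form
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pairs)


# Hypothetical annotation for one prompt: "b" and "c" are tied.
ranks = {"a": 1, "b": 2, "c": 2, "d": 3}
pairs = ranking_to_pairs(ranks)

# Scores that agree with the ranking give a lower loss than scores that contradict it.
good_scores = {"a": 2.0, "b": 0.5, "c": 0.5, "d": -1.0}
bad_scores = {"a": -1.0, "b": 0.5, "c": 0.5, "d": 2.0}
print(len(pairs), pairwise_loss(good_scores, pairs), pairwise_loss(bad_scores, pairs))
```

In an actual training loop the scores would be produced by the reward model and the loss would be backpropagated through it; this sketch only shows how rankings become pairs and how a pair is scored.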