Open diegofiori opened 1 year ago
OpenAssistant has released on HF the reward models they trained on the open-source datasets. Even though they are not tailored to the user's needs, we could leverage them as a starting point for fine-tuning the user reward models.
Available reward models:
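As a minimal sketch of the idea, the snippet below loads one of the publicly released OpenAssistant reward checkpoints from HF and ranks candidate responses by their scalar reward score; the specific model name is one released variant and can be swapped for another, and `rank_responses` is a hypothetical helper name for illustration:

```python
# Sketch: use an OpenAssistant reward model as a starting point for scoring
# (and later fine-tuning). Assumes `transformers` and `torch` are installed.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# One of the released OA reward checkpoints; swap in another variant as needed.
MODEL_NAME = "OpenAssistant/reward-model-deberta-v3-large-v2"

def rank_responses(question, responses, tokenizer, model):
    """Return (response, score) pairs sorted from highest to lowest reward."""
    scores = []
    with torch.no_grad():
        for response in responses:
            # The reward model scores a (question, response) pair and
            # outputs a single scalar logit.
            inputs = tokenizer(question, response,
                               return_tensors="pt", truncation=True)
            scores.append(model(**inputs).logits[0].item())
    return sorted(zip(responses, scores), key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
    ranked = rank_responses(
        "What is the capital of France?",
        ["Paris is the capital of France.", "I don't know."],
        tokenizer, model,
    )
    for response, score in ranked:
        print(f"{score:+.3f}  {response}")
```

From here, fine-tuning on the user's own preference data would continue from this checkpoint instead of training a reward model from scratch.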
Can I work on this?
Please go ahead, let me know if you need any support or if you have any questions. I assigned you to this issue. Thank you! @gagan3012