IBM / SALMON

Self-Alignment with Principle-Following Reward Models
https://arxiv.org/abs/2310.05910
GNU General Public License v3.0
148 stars, 14 forks