IBM / SALMON

Self-Alignment with Principle-Following Reward Models
https://arxiv.org/abs/2310.05910
GNU General Public License v3.0
144 stars, 12 forks