IBM / SALMON
Self-Alignment with Principle-Following Reward Models
https://arxiv.org/abs/2310.05910
GNU General Public License v3.0
148 stars · 14 forks
Issues
#5  In which Training step do you use HH-RLHF and SHP datasets?  (richhh520, opened 4 months ago, 1 comment)
#4  A question about the paper  (richhh520, opened 4 months ago, 1 comment)
#3  fix RewardModel forward bug  (UbeCc, opened 8 months ago, 0 comments)
#2  Dataset: upload preference dataset  (Dada-Cloudzxy, opened 10 months ago, 0 comments)
#1  Fix typo in README.md  (eltociear, closed 1 year ago, 0 comments)