allenai FineGrainedRLHF issues - Githubissues

allenai / FineGrainedRLHF

Apache License 2.0

255 stars, 21 forks

issues

concerns about the length of the input

#14 andyclsr closed 6 months ago
1
Running PPO with a subset of RMs

#13 vishwa27yvs closed 6 months ago
2
Issue when running train_sft.sh

#12 yunsaijc closed 9 months ago
1
Factuality

#11 sauc-abadal closed 11 months ago
0
Weights of reward functions in RLHF

#10 jsw7460 closed 11 months ago
1
Ran the source code without modifying the model or parameters, but the loss and reward fluctuate heavily

#9 Congcong-Song opened 1 year ago
3
What does lastgaelam refer to when computing advantages?

#8 Congcong-Song closed 7 months ago
1
Is multi-GPU parallelism supported? During PPO training, all models seem to be loaded onto the same GPU

#7 Congcong-Song closed 1 year ago
1
Could the trained modeling_output be provided, e.g. the preference model and the reward models?

#6 Congcong-Song closed 1 year ago
2
transformers.generation not found during SFT training

#5 Congcong-Song closed 1 year ago
1
Is there any plan to share the pre-trained rewards (R1, R2, R3 and R_pref) on HuggingFace?

#4 ZHZisZZ closed 1 year ago
0
training files missing for training finegrained reward models

#3 wise-east closed 1 year ago
2
rename requrements.txt

#2 nishkalavallabhi closed 1 year ago
1
Open-sourcing the reward models

#1 Glavin001 closed 1 year ago
2