allenai/FineGrainedRLHF
Apache License 2.0
255 stars
21 forks
issues
concerns about the length of the input
#14
andyclsr
closed
6 months ago
1
Running PPO with a subset of RMs
#13
vishwa27yvs
closed
6 months ago
2
Issue when running train_sft.sh
#12
yunsaijc
closed
9 months ago
1
Factuality
#11
sauc-abadal
closed
11 months ago
0
Weights of reward functions in RLHF
#10
jsw7460
closed
11 months ago
1
Ran the source code without modifying the model or parameters, but the loss and reward fluctuate heavily
#9
Congcong-Song
opened
1 year ago
3
What does lastgaelam refer to when computing advantages?
#8
Congcong-Song
closed
7 months ago
1
Is multi-GPU parallelism supported? When training PPO, all models seem to be loaded onto the same GPU
#7
Congcong-Song
closed
1 year ago
1
Could you provide the trained modeling_output, e.g. the preference model and the reward models?
#6
Congcong-Song
closed
1 year ago
2
transformers.generation not found during SFT training
#5
Congcong-Song
closed
1 year ago
1
Is there any plan to share the pre-trained rewards (R1, R2, R3 and R_pref) on HuggingFace?
#4
ZHZisZZ
closed
1 year ago
0
training files missing for training finegrained reward models
#3
wise-east
closed
1 year ago
2
rename requrements.txt
#2
nishkalavallabhi
closed
1 year ago
1
Open-sourcing the reward models
#1
Glavin001
closed
1 year ago
2