Pillars-Creation / ChatGLM-RLHF-LoRA-RM-PPO
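Adds an RLHF implementation to ChatGLM-6B, with line-by-line walkthroughs of some of the core code. The examples cover short news headline generation and an RLHF implementation for recommendation given a specified context.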
Apache License 2.0
78 stars
8 forks
issues
[BUG/Help] <title>Is the input truncation here written incorrectly?
#8
cheng940929
opened
6 months ago
1
[BUG/Help] <title>ModuleNotFoundError: No module named 'datasets'
#7
xihaofei
opened
10 months ago
1
[BUG/Help] The model does not seem to load the LoRA weights correctly
#6
mockyd
opened
11 months ago
1
[BUG/Help] Running 'python finetune_ppo.py' with the fp16 option enabled raises ValueError: Attempting to unscale FP16 gradients.
#5
mockyd
closed
11 months ago
1
Error during web deployment
#4
wuQi-666
opened
11 months ago
0
Building the dataset for RM training
#3
wuQi-666
opened
12 months ago
2
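[BUG/Help] Did the author run the entire pipeline on a single A100 with 40 GB of GPU memory, including the later PPO stage (which needs two models loaded at the same time)?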
#2
BIT-Xu
opened
1 year ago
4
After fine-tuning with LoRA, inference output shows ??
#1
oppppp
opened
1 year ago
1