OpenLLMAI / OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0 · 1.73k stars · 164 forks
Issues (sorted by: Newest)
- #196 Why is dschf defined in function scope? (kajyuuen, closed, 5 months ago, 1 comment)
- #195 Bug: 'ActorModelRayActor' object has no attribute 'actor' (hmzo, closed, 5 months ago, 1 comment)
- #194 A detail question about reward data preparation (tonylin52, closed, 5 months ago, 2 comments)
- #193 question about support matrix (paulcx, closed, 4 months ago, 21 comments)
- #192 Can the reward model and reference model also be split off into vLLM services? (HuangLK, closed, 5 months ago, 3 comments)
- #191 Enable deepspeed.zero.Init causes very strange spikes in PPO policy_loss (wuxibin89, closed, 4 months ago, 2 comments)
- #190 support Mixture-of-Experts (MoE) model training (ifromeast, closed, 5 months ago, 1 comment)
- #189 Support LoRA and QLoRA (nf4), fix mixtral loss (hijkzzz, closed, 6 months ago, 0 comments)
- #188 About using vLLM for generation (LSC527, open, 6 months ago, 5 comments)
- #187 Actor killed after training one episode (zhanghaoie, closed, 5 months ago, 9 comments)
- #186 fix: initialize value_head for ZeRO-3 reward model training (wuxibin89, closed, 6 months ago, 0 comments)
- #185 fix: assign GPU for LLMActor if vllm_tensor_parallel_size=1 (wuxibin89, closed, 6 months ago, 0 comments)
- #184 What version of vllm should we use? (LSC527, closed, 6 months ago, 3 comments)
- #183 support mixtral 8*7b balancing loss (hijkzzz, closed, 6 months ago, 0 comments)
- #182 fix: add unk_token when vllm generate eos_token only (wuxibin89, closed, 6 months ago, 0 comments)
- #181 Why not include eos_token in action_seq? This may cause mistakes (ZiyiLiubird, closed, 6 months ago, 16 comments)
- #180 Some weights of LLMForSequenceRegression were not initialized (eyuansu62, closed, 6 months ago, 4 comments)
- #179 add official doc (catqaq, closed, 6 months ago, 0 comments)
- #178 Support pipeline module such as LLaMA2Pipeline and InstructGPTPipeline (catqaq, open, 6 months ago, 0 comments)
- #177 [bug]: AttributeError: 'dict' object has no attribute 'mean' (eyuansu62, closed, 6 months ago, 1 comment)
- #176 Why not mask unk_token during actor generation? (ZiyiLiubird, closed, 6 months ago, 4 comments)
- #175 Bug? (ZiyiLiubird, closed, 6 months ago, 2 comments)
- #174 upgrade container for to_bettertransformer (hijkzzz, closed, 6 months ago, 0 comments)
- #173 [pre-commit.ci] pre-commit suggestions (pre-commit-ci[bot], closed, 6 months ago, 0 comments)
- #172 feat: dynamically construct transformer for sequence classification (wuxibin89, closed, 6 months ago, 0 comments)
- #171 High Memory Usage (zhanghaoie, closed, 6 months ago, 5 comments)
- #170 feat: enable transformers deepspeed ZeRO-3 integration (wuxibin89, closed, 6 months ago, 0 comments)
- #169 refactor ds config and fix flash_attn / model.config.pad_token_id (hijkzzz, closed, 6 months ago, 0 comments)
- #168 update Logo (hijkzzz, closed, 6 months ago, 0 comments)
- #167 Logo (hijkzzz, closed, 6 months ago, 0 comments)
- #166 Fix flash attention option (li-plus, closed, 6 months ago, 5 comments)
- #165 Optimize padding removal (li-plus, closed, 6 months ago, 2 comments)
- #163 remove pad token and embedding resize for llama (hijkzzz, closed, 6 months ago, 0 comments)
- #162 Optimize reward score gather/scatter (li-plus, closed, 6 months ago, 2 comments)
- #161 Support KTO (hijkzzz, closed, 5 months ago, 2 comments)
- #160 feat: add vLLM for text generation (wuxibin89, closed, 6 months ago, 0 comments)
- #159 baichuan2-13b-base as actor: RuntimeError: CUDA error: device-side assert triggered (netrookiecn, closed, 6 months ago, 5 comments)
- #158 Why is it as much as 4x faster than DeepSpeed-Chat? (tingshua-yts, closed, 6 months ago, 4 comments)
- #157 Loading RM ckpt bug: AttributeError: 'NoneType' object has no attribute 'load' (pikaqqqqqq, closed, 6 months ago, 3 comments)
- #156 How to convert the .pt checkpoint provided on Hugging Face into the common .bin format? (zjintheroom, closed, 6 months ago, 4 comments)
- #155 Add pipeline module to support more scientific comparative experiments and research (catqaq, open, 7 months ago, 0 comments)
- #154 opt local file support (catqaq, closed, 7 months ago, 0 comments)
- #153 Repair parquet file reading issue in utils.py (tsaoyu, closed, 7 months ago, 1 comment)
- #152 Optimize generation post processing (li-plus, closed, 6 months ago, 4 comments)
- #151 feature: add API support for hosting a reward model (ftmtk, open, 7 months ago, 5 comments)
- #150 Discussion on our 1st release (jovany-wang, closed, 3 months ago, 1 comment)
- #149 [Severity] High similarity with Colossal-AI (binmakeswell, closed, 7 months ago, 4 comments)
- #148 Inquiry regarding the feasibility of fine-tuning LLaMA2-7B with a single A100 (callanwu, closed, 7 months ago, 4 comments)
- #147 fix: ensure all experience has been sent to critic before training (wuxibin89, closed, 7 months ago, 0 comments)
- #146 support Qwen (pikaqqqqqq, closed, 7 months ago, 1 comment)