OpenLLMAI OpenRLHF issues

OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

https://openrlhf.readthedocs.io/

Apache License 2.0

1.72k stars 161 forks

issues

enable_ema causes runtime error when running train_ppo_llama.sh

#245 dshnightmare opened 3 months ago
6
Update requirements.txt

#244 kfertakis closed 3 months ago
0
Issues with building OpenRLHF locally

#243 kfertakis closed 2 months ago
3
The tokenizer of reward model and policy model.

#242 eyuansu62 opened 3 months ago
2
Fix yi-34b tokenizer, use_fast=False

#241 hijkzzz closed 3 months ago
0
Assertion `srcIndex < srcSelectDimSize` failed when running DPO on Yi-34B

#240 victorShawFan closed 3 months ago
7
Why is generation with flash-attn slower?

#239 dshnightmare opened 3 months ago
2
Forced EOS token in vllm generation?

#238 mgerstgrasser opened 3 months ago
6
Fix #235 mask prompt logits in DPO

#237 hijkzzz closed 4 months ago
0
adding length penalty to reward

#236 karthik-nexusflow opened 4 months ago
1
DPO Loss

#235 paulcx closed 4 months ago
14
cuda.is_available is False in LLMRayActor

#233 THINK2TRY closed 3 months ago
9
Is left-padding in PPO strictly necessary?

#232 mgerstgrasser opened 4 months ago
6
Use existing wandb login if available.

#231 mgerstgrasser closed 4 months ago
1
Actor-Critic-Model

#230 mgerstgrasser opened 4 months ago
5
fix: make vllm lazy import

#229 wuxibin89 closed 4 months ago
0
Compatibility between vllm and NGC

#228 THINK2TRY closed 4 months ago
5
vllm requirement problem

#227 jiashenggu closed 4 months ago
6
OpenRLHF/openrlhf/models/utils.py LlamaRotaryEmbedding is not compatible with transformers 4.38.1

#226 jiashenggu closed 4 months ago
0
Fix tensor shapes in Experience class documentation

#225 Thecats-Jfm closed 4 months ago
0
clarification on config std and mean calculation

#224 karthik-nexusflow closed 4 months ago
3
update support matrix

#223 haicaihi closed 4 months ago
1
vLLM in batch_inference.py

#222 CoeusMaze closed 4 months ago
2
Citation or comparison to trlX and NeMo-align.

#221 LouisCastricato opened 4 months ago
3
Support top models stage2

#220 catqaq opened 4 months ago
0
use_right_pad

#219 hijkzzz closed 4 months ago
1
#217 fix position_ids

#218 hijkzzz closed 4 months ago
1
`position_ids` related PPO bug

#217 tianhao-nexusflow closed 4 months ago
2
support input_key and output_key in datasets

#216 hijkzzz closed 4 months ago
0
fix: adjust vllm monkey patch for vllm>=0.2.7

#215 wuxibin89 closed 4 months ago
0
fix: monkey patch vllm with different versions

#214 wuxibin89 closed 4 months ago
0
fix: ignore non-persistent named buffer when save model

#213 wuxibin89 closed 4 months ago
0
Error when saving checkpoint with Mistral model

#212 karthik19967829 closed 4 months ago
10
vllm +zero2 hangs

#211 karthik19967829 opened 4 months ago
32
Loading a reward model causes ValueError: weight is on the meta device, we need a `value` to put in on 0

#209 NZ99 opened 4 months ago
19
Got stuck when using PyTorch extensions root during multi-slurm node SFT and cannot continue

#208 Dear-Sloth closed 5 months ago
1
Fix get_strategy

#207 kajyuuen closed 5 months ago
0
fix gradient_checkpointing_kwargs bug

#206 wwxFromTju closed 5 months ago
0
Improve ease of use

#205 hijkzzz opened 5 months ago
1
feat: support Input template

#203 hijkzzz closed 5 months ago
0
Update dataset to support user input template

#202 rbao2018 closed 5 months ago
1
Implement KTO into OpenRLHF

#201 Dylancer1998 closed 5 months ago
1
Enable overlap_comm for better performance

#200 li-plus closed 5 months ago
4
Change run mode so that it can be run directly in the shell.

#199 jovany-wang closed 5 months ago
0
Workers (tasks / actors) killed due to memory pressure (OOM)

#198 LSC527 closed 5 months ago
4
fix: ray actor and critic attribute error

#197 wuxibin89 closed 5 months ago
0
Why is dschf defined in function scope?

#196 kajyuuen closed 5 months ago
1
Bug: 'ActorModelRayActor' object has no attribute 'actor'

#195 hmzo closed 5 months ago
1
A question about a detail of reward data preparation

#194 tonylin52 closed 5 months ago
2
question about support matrix

#193 paulcx closed 4 months ago
21
