OpenLLMAI OpenRLHF issues

OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

https://openrlhf.readthedocs.io/

Apache License 2.0

1.71k stars 160 forks

issues

Newest

Timeout error in PPO after enabling ZeRO stage 3

#293 victorShawFan opened 1 month ago
3
No response after enabling PPO with Ray

#292 victorShawFan closed 1 month ago
3
RLHF for classification tasks

#291 vinodrajendran001 closed 1 day ago
2
HTTPError when running train_ppo_llama_ray.sh

#290 Zeyuan-Liu closed 1 day ago
5
[question] long context for single model ppo training

#289 yananchen1989 closed 1 month ago
1
RM training loss becomes NaN after the first training step

#288 lixsh6 opened 1 month ago
1
Model refuses to answer after PPO training

#287 burger-pb closed 1 day ago
3
Vllm0.42 + Lora configs

#286 hijkzzz closed 1 month ago
0
Custom ExperienceMaker

#285 mgerstgrasser opened 2 months ago
4
when import requests, class NewLineFormatter(logging.Formatter): AttributeError: partially initialized module 'logging' has no attribute 'Formatter' (most likely due to a circular import)

#284 catqaq opened 2 months ago
0
fix vLLM v0.4.1

#283 hijkzzz closed 2 months ago
3
Update NGC and vllm version.

#282 THINK2TRY closed 2 months ago
2
Revert "vllm 0.4.1 compatibility (#278)"

#281 hijkzzz closed 2 months ago
0
fix typos in train_ppo_ray.py

#280 mickelliu closed 2 months ago
1
fix typos in train_ppo_ray.py

#279 mickelliu closed 2 months ago
0
vllm 0.4.1 compatibility

#278 mgerstgrasser closed 2 months ago
6
Out-of-memory issue

#277 burger-pb closed 1 day ago
3
upgrade transformers/deepspeed and sync when colocating models

#276 hijkzzz closed 2 months ago
1
CUDA out of memory when I run train_ppo_llama_ray.sh on 4 RTX 4090 (24G)

#275 libowen424 closed 2 months ago
2
AssertionError: mismatch size output_state_dict(148) and state_dict(149) sft training

#274 qwenzo closed 2 months ago
3
Reward model dataset question

#273 burger-pb closed 1 day ago
3
PPO training configuration for train_ppo_llama.sh

#272 MurrayTom closed 1 day ago
1
NCCL broadcast error after first actor fit

#271 karthik-nexusflow closed 2 months ago
17
Issue with models not using `position_ids`

#270 kfertakis opened 2 months ago
1
The configuration for Llama-7b on 4 RTX4090

#269 LinkyLiu opened 2 months ago
5
Inconsistent python version dependency

#268 snailrowen1337 closed 2 months ago
1
add test pipeline: use small LLM and small data

#267 catqaq opened 2 months ago
0
Documentation for using Kuberay

#266 karthik-nexusflow opened 2 months ago
4
vllm / actor broadcast error in multinode training

#265 karthik-nexusflow closed 2 months ago
28
add Knowledge Distillation

#264 ifromeast closed 2 months ago
3
[Baseline] LLaMA2-7B RLHF training curves

#263 hijkzzz opened 2 months ago
2
How long does tuning a single LLM require?

#262 alphahumancoder opened 2 months ago
3
Add a Chinese README.md file

#261 khazic closed 2 months ago
0
How to get score for a single response from a trained RM

#260 UbeCc closed 2 months ago
1
use custom datasets and cache_dir

#259 UbeCc closed 2 months ago
4
debugging with ray

#258 mickel-liu closed 3 months ago
2
[pre-commit.ci] pre-commit suggestions

#257 pre-commit-ci[bot] closed 3 months ago
0
Is save checkpoint not yet supported for ppo ray trainer?

#256 mickel-liu opened 3 months ago
5
how to train in fp16?

#255 dshnightmare closed 3 months ago
1
Hardware requirement

#254 ridiculouz closed 3 months ago
6
Support ORPO

#253 paulcx opened 3 months ago
1
this repo's hack of rope embedding accepts different input than transformers

#252 babu111 closed 3 months ago
3
[For your information] Ways to build environment and run openrlhf codes on a slurm cluster

#251 glorgao closed 2 months ago
2
add perf and benchmark scripts

#249 wuxibin89 closed 1 month ago
3
fix num_training_steps when micro rollout and train size are not equal

#248 wuxibin89 closed 3 months ago
0
Fixed error due to 'margin' variable type being list in rm_trainer.py

#247 StwayneXG closed 3 months ago
0
Unexpected long actor_time when train_ppo_ray

#246 LSC527 opened 3 months ago
9
enable_ema causes runtime error when running train_ppo_llama.sh

#245 dshnightmare opened 3 months ago
6
Update requirements.txt

#244 kfertakis closed 3 months ago
0
Issues with building OpenRLHF locally

#243 kfertakis closed 2 months ago
3
