rlhf Search Results - Githubissues

1000+ results for rlhf


saied71/LLM-Finetuning #3

DPO

saied71 updated 6 months ago
1
microsoft/DeepSpeedExamples #525

[bug]AttributeError: 'DeepSpeedHybridEngine' object has no a…

my training environment is a docker image pulled from `deepspeed/deepspeed:v072_torch112_cu117` and i run it with `docker run -it --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --…

qingchu123 updated 3 months ago
4
AkihikoWatanabe/paper_notes #807

Secrets of RLHF in Large Language Models Part I: PPO, Rui Zh…

# URL - https://arxiv.org/abs/2307.04964 # Affiliations - Rui Zheng, N/A - Shihan Dou, N/A - Songyang Gao, N/A - Wei Shen, N/A - Binghai Wang, N/A - Yan Liu, N/A - Senjie Jin, N/A - Qi…

AkihikoWatanabe updated 8 months ago
2
OpenLMLab/MOSS-RLHF #37

Why are you not releasing your reward model for english?

AmanSinghal927 updated 6 months ago
1
Xwin-LM/Xwin-LM #14

Missing source code, model card, data sheet &c

For a project that "aims to develop and open-source alignment technologies for large language models" the source & all other aspects are remarkably closed. At [opening-up-chatgpt.github.io](https://op…

mdingemanse updated 8 months ago
2
OpenBMB/UltraFeedback #2

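Questions about the reward model and the critique model?

Hello, I see that the datasets are all in English. Can a reward model or critique model trained on English data be used for Chinese? Will a Chinese RLHF dataset be open-sourced later?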

liumingzhu6060 updated 8 months ago
1
karpathy/nanoGPT #41

Is it possible: davinci-003?

Can this approach be used to create a nano-sized `text-davinci-003`?

gameveloster updated 4 months ago
4
microsoft/DeepSpeed #4717

[BUG] Failure when trying to use bf16 for RLHF on ROCM -- mi…

**Describe the bug** When using deepspeed-chat RLHF on ROCM/AMD, it crashes if I use bf16 (fp16 works on AMD, both work on NVIDIA). This seems to be because enable_bf16 is never set in op_builder/bui…

bumbawumba updated 7 months ago
1
embeddings-benchmark/mteb #920

Add Voyage multilingual datasets

Lots of multilingual datasets listed here https://docs.google.com/spreadsheets/d/1qf0iYejG-9RgEEi13qB_SK_178-eNaeJDmSDNSj260A/edit?gid=1875159366#gid=1875159366 from https://blog.voyageai.com/2024/06/…

Muennighoff updated 2 weeks ago
1
microsoft/DeepSpeed #3232

[BUG] trainning [ERROR] [launch.py:434:sigkill_handler] exi…

[2023-04-14 13:11:27,879] [INFO] [launch.py:428:sigkill_handler] Killing subprocess 13266 [2023-04-14 13:11:27,885] [ERROR] [launch.py:434:sigkill_handler] ['/usr/bin/python3', '-u', 'main.py', '--lo…

le153234 updated 9 months ago
14
