rlhf Search Results - Githubissues

1000+ results
for rlhf

Best match

microsoft/DeepSpeedExamples #403

Error after changing the model from opt to gpt

I trained the PPO model using GPT. I changed the model_name_or_path option from opt to gpt2. Steps 1 and 2 passed, but an error occurred in step 3. The error is as follows: ╭────────────…

lljjgg updated 8 months ago
2
microsoft/DeepSpeed #3672

[BUG] Multi-node failure with Step3 RLHF Training with GPTJ6…

**Describe the bug** I am not able to run the multi-node script for 6B actor and critic on 2 nodes of 8 V100 GPUs on Azure ML. I am running the following command: deepspeed --master_port 29501 ma…

hiteshis updated 1 year ago
3
HumanSignal/label-studio-ml-backend #233

I want to create a RLHF backend/frontend for labelling<=>tra…

If anyone has a lead on this, please let me know. Also, if anyone wants to collaborate in this direction, please let me know.

hemangjoshi37a updated 1 year ago
8
AkihikoWatanabe/paper_notes #894

trl/trlx

trl/trlx: Libraries for applying RLHF to Transformer-based LLMs https://github.com/CarperAI/trlx

AkihikoWatanabe updated 11 months ago
2
nebuly-ai/nebuly #224

[Chatllama] Use upvotes in Stanford dataset as a measure for…

# Description Currently we are supporting the following datasets: - [Stanford Human Preferences Dataset (SHP)](https://huggingface.co/datasets/stanfordnlp/SHP) - [Anthropic RLHF](https://huggingf…

diegofiori updated 1 year ago
8
THUDM/VisualGLM-6B #32

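What exactly does self.dtype refer to in modeling_chatglm.py?

Hello, I have recently been using visualglm to train a reward model. While modifying and reading the code, I found this line in modeling_chatglm.py: torch_image = torch_image.to(self.dtype).to(self.device). What exactly does self.dtype refer to here? I could not find a corresponding definition in the code.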

iamsile updated 1 year ago
4
LAION-AI/Open-Assistant #3482

OA Developer Meeting

Last meeting #3321 * spam, bots, and data quality for inference and RLHF * found this old issue #914

AbdBarho updated 1 year ago
3
nebuly-ai/nebuly #256

[Chatllama] Errors when training actor model based on LLaMA-…

root@b787722dc2e1:/workspace/workfile/Projects/chatllama# python artifacts/main.py artifacts/config/config.yaml --type ACTOR Current device used :cuda local_rank: -1 world_size: -1 Traceback (most …

young-chao updated 1 year ago
4
kohya-ss/sd-scripts #575

rl-stablediffusion training

Any chance you could implement this? https://github.com/vinhkhuc/ddpo/tree/support_gpu It's for RLHF-type work; [check the paper](https://rl-diffusion.github.io/). Could be really interesting fo…

nicolai256 updated 1 year ago
2
pandas-dev/pandas #39435

ENH: support reading from several files for read_* functions

#### Is your feature request related to a problem? In general, the implementation of this idea should contribute to simplification of reading functions use and reduce the use of boilerplate code. …

anmyachev updated 4 months ago
1
