-
I get the following error:
"(h2ogpt) C:\Users\username\h2ogpt>python generate.py --base_model=h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3 --langchain_mode=UserData --score_model=None --load_4bit=Tr…
-
I'm trying to run the `sentiment_tuning.py` [example](https://github.com/lvwerra/trl/blob/main/examples/scripts/sentiment_tuning.py) with `accelerate` and DeepSpeed ZeRO-3, but am hitting a runtime er…
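For context, a typical `accelerate` configuration for DeepSpeed ZeRO-3 (as produced by running `accelerate config`) looks roughly like the sketch below. All values are illustrative assumptions, not the reporter's actual settings:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 3            # ZeRO-3: partition params, grads, and optimizer state
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: true    # initialize large models directly in ZeRO-3 partitions
mixed_precision: bf16
num_processes: 2           # assumed GPU count; adjust to the actual machine
```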
-
The base model is baichuan-13b. SFT was full-parameter fine-tuning, and the RM was LoRA fine-tuned on top of the SFT model. The PPO launch script is as follows:
```
export CUDA_VISIBLE_DEVICES=1,2,3,4
deepspeed --num_gpus 4 --master_port=9901 src/train_bash.py \
--deepspeed deepspeed_zero3.json…
```
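The referenced `deepspeed_zero3.json` is not shown in the snippet; a minimal ZeRO-3 config commonly looks like the sketch below. Every value here is an assumption for illustration, not the reporter's actual file:

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": {
    "enabled": "auto"
  }
}
```

The `"auto"` values let the launcher (e.g. the Hugging Face Trainer integration) fill them in from its own arguments.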
-
I've reproduced the whole StackLLaMA pipeline using the changes in #398, #399, and #400.
Here is the [corresponding wandb report](https://wandb.ai/mnoukhov/trl/reports/StackLLaMA-Repro--Vmlldzo0NTM1MDk2)…
-
sudo docker run \
--gpus all \
--runtime=nvidia \
--shm-size=2g \
--rm --init \
--network host \
-v /etc/passwd:/etc/passwd:ro \
-v /etc/group:/…
-
## How to Reproduce
1. Train
2. Push to Hugging Face
3. Error 😢
With:
- Docker image:
- Runpod A100 80GB
Config:
```yaml
architecture:
backbone_dtype: int8
force_embedding_…
-
# Unreal Engine 5 Gameplay Ability System (GAS)
https://docs.unrealengine.com/5.3/zh-CN/gameplay-ability-system-for-unreal-engine/
-
Reproduce:
`torchrun reward_summarization.py`
Details:
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a metho…
-
Hi, I also hit the same issue when running `bash run_finetune_with_lora.sh` with `LLAMA-7b`. Here are my script and log:
```
#!/bin/bash
# Please run this script under ${project…