rlhf Search Results - Githubissues

1000+ results
for rlhf


microsoft/DeepSpeed #4717

[BUG] Failure when trying to use bf16 for RLHF on ROCM -- mi…

**Describe the bug** When using deepspeed-chat RLHF on ROCM/AMD, it crashes if I use bf16 (fp16 works on AMD, both work on NVIDIA). This seems to be because enable_bf16 is never set in op_builder/bui…

bumbawumba updated 10 months ago
1
THUDM/ImageReward #26

How to use HuggingFace Data?

Hi, @xujz18 @Xiao9905 Thanks for this nice contribution. I noticed that we can load ImageReward data with: `datasets.load_dataset("THUDM/ImageRewardDB", "8k")` However, the loaded data seem to…

liming-ai updated 11 months ago
11
microsoft/vscode #215609

vscode hang when try to delete a large directory

Type: Bug I have a directory about 100000 files total 3T size. When I right click it and select "delete permanently", vscode terminal get hang. And later, remote ssh connection is broken. I then r…

ladyrick updated 3 months ago
3
huggingface/datasets #7037

A bug of Dataset.to_json() function

### Describe the bug When using the Dataset.to_json() function, an unexpected error occurs if the parameter is set to lines=False. The stored data should be in the form of a list, but it actually tur…

LinglingGreat updated 2 weeks ago
2
microsoft/DeepSpeedExamples #458

Adding two loss from actor will lead to an error " gradient …

When training the ppo model, I turned on the gradient_checkpointing_enable. If you want to calculate ptx loss, then actor will forward twice. In your code, these two loss are executed backward once se…

piekey1994 updated 6 months ago
4
unslothai/unsloth #725

Does it support rloo_trainer of trl?

> [rank0]: Traceback (most recent call last): > [rank0]: File "/opt/tmp/nlp/wzh/LLM-Dojo/rlhf/rloo_train.py", line 167, in <module> > [rank0]: trainer.train() > [rank0]: File "/home/nlp/miniconda3/…

mst272 updated 1 day ago
7
yangzhipeng1108/DeepSpeed-Chat-ChatGLM #9

AutoModelForCausalLM

AutoModelForCausalLM has no class for chatglm. How did you solve this?

Altrouge7 updated 10 months ago
4
Liang-Jiaying/RLAIF #3

Questions to research and think

- [ ] Why the author only compare RLAIF with RLHF on task of summarization? - [ ] How are the performances for other tasks? - [ ] For 4.1 Datasets, what other ways OpenAI use to filter the data? - …

Liang-Jiaying updated 11 months ago
1
karpathy/nanoGPT #41

Is it possible: davinci-003?

Can this approach be used to create a nano-sized `text-davinci-003`?

gameveloster updated 7 months ago
4
CarperAI/trlx #379

Add support XGLM model

### 🚀 The feature, motivation, and pitch I think trlx should support XGLM model for training PPO because XGLM has support 134 languages in [XGLM-4.5B model](https://huggingface.co/facebook/xglm-4.5B)…

tontan1998 updated 1 year ago
2

