safe-rlhf Search Results

PKU-Alignment/safe-rlhf #184

[BUG] Llama-3.2 DeepSpeed configuration

### Required prerequisites - [X] I have read the documentation. - [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-…

AAAhWei updated 1 day ago

ethz-spylab/rlhf-poisoning #7

No module named 'safe_rlhf'

Failed to run the evaluation script.

Oklahomawhore updated 5 months ago

PKU-Alignment/safe-rlhf #183

[Question]

### Required prerequisites - [X] I have read the documentation. - [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/P…

cyzhh updated 4 days ago

ethz-spylab/rlhf-poisoning #9

code understanding

I would now like to be able to read your code and make changes, any suggested ideas, can you say what the classes defined in safe-rlhf mean? such as AutoModelForScore, PreferenceDataset. What's more, …

hanbaoergogo updated 2 months ago

ethz-spylab/rlhf-poisoning #8

Evaluation Dataset

Hello, I would like to ask how to create an evaluation dataset. When I directly run `python evaluate_generation_model.py --model_path ../../LLM_Models/poison-7b-SUDO- --token SUDO --report_path ./…

chiayi-hsu updated 4 months ago

Azure/PyRIT #400

DOC combine datasets notebooks into one

#### Describe the issue linked to the documentation We started with one notebook per dataset (in the doc/code/orchestrators directory) and now it's becoming a lot. Since they all pretty much fo…

romanlutz updated 2 weeks ago

PKU-Alignment/safe-rlhf #20

microsoft/DeepSpeed #6522

[BUG] error: past_key, past_value = layer_past, how to solve …

**Describe the bug** when I run train, rlhf step 3; ``` Actor_Lr=9.65e-6 Critic_Lr=5e-6 #--data_path Dahoas/rm-static \ #--offload_reference_model \ deepspeed --master_port 12346 main_step3.py…

lovychen updated 1 month ago

145 results for safe-rlhf

[Feature Request] LoRA support for memory efficient fine-tun…

[Question] GPT-4 and Human Evaluation

Secrets of RLHF in Large Language Models Part I: PPO, Rui Zh…
