safe-rlhf Search Results

132 results for safe-rlhf


ethz-spylab/rlhf-poisoning #7

No module named 'safe_rlhf'

Failed to run the evaluation script.

Oklahomawhore updated 3 months ago
1
ethz-spylab/rlhf-poisoning #8

Evaluation Dataset

Hello, I would like to ask how to create an evaluation dataset. When I directly run `python evaluate_generation_model.py --model_path ../../LLM_Models/poison-7b-SUDO- --token SUDO --report_path ./…

chiayi-hsu updated 2 months ago
5
Vali-98/ChatterUI #84

Allow longer prompt

Despite enabling an 8K context window in ChatterUI, longer prompts are not being forwarded to the local API. This issue suggests a potential limitation within ChatterUI's prompt handling, preventin…

vYLQs6 updated 1 day ago
2
PKU-Alignment/safe-rlhf #181

Failing to train cost model (ValueError: The safer answer is…

### Bug Report I have tried to reproduce the results on my own using Llama 3.1 8b. I can successfully run the SFT and Reward models trainers. But, the cost model trainer consistently crashes. …

cemiu updated 6 days ago
5
huggingface/peft #2054

Problem with model.merge_and_unload - the saved model is al…

### System Info Ubuntu 22.04 all latest versions ### Who can help? @BenjaminBossan @sayakpaul ### Information - [ ] The official example scripts - [x] My own modified scripts ### Ta…

Oxi84 updated 6 days ago
3
PKU-Alignment/safe-rlhf #20

[Feature Request] LoRA support for memory efficient fine-tun…

### Required prerequisites - [X] I have read the documentation. - [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-…

70557dzqc updated 7 months ago
2
PKU-Alignment/safe-rlhf #161

[Question] GPT-4 and Human Evaluation

### Required prerequisites - [X] I have read the documentation. - [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-…

gao-xiao-bai updated 2 months ago
1
microsoft/DeepSpeed #6522

[BUG] error: past_key, past_value = layer_past, how to solve …

**Describe the bug** When I run training, RLHF step 3: ``` Actor_Lr=9.65e-6 Critic_Lr=5e-6 #--data_path Dahoas/rm-static \ #--offload_reference_model \ deepspeed --master_port 12346 main_step3.py…

lovychen updated 4 days ago
1
RLHFlow/RLHF-Reward-Modeling #29

preference dataset 404 not found

> We preprocess many open-source preference datasets into the standard format and upload them to the Hugging Face Hub. You can find them [HERE](https://huggingface.co/collections/RLHFlow/standard-format…

wty500 updated 2 weeks ago
2
AkihikoWatanabe/paper_notes #807

Secrets of RLHF in Large Language Models Part I: PPO, Rui Zh…

# URL - https://arxiv.org/abs/2307.04964 # Affiliations - Rui Zheng, N/A - Shihan Dou, N/A - Songyang Gao, N/A - Wei Shen, N/A - Binghai Wang, N/A - Yan Liu, N/A - Senjie Jin, N/A - Qi…

AkihikoWatanabe updated 11 months ago
2
