RLOOTrainer bug when using DeepSpeed

macheng6 commented 1 week ago

System Info

When using DeepSpeed, RLOOTrainer raises an error: "ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any functionality from the accelerate library." This is likely because the Accelerator is not properly initialized at line 120 of the RLOOTrainer code, possibly because the deepspeed_plugin was not passed in.

trl: 0.12.0.dev0
transformers: 4.45.2
accelerate: 1.0.1

Information

[X] The official example scripts
[ ] My own modified scripts

Tasks

[X] An officially supported task in the examples folder
[ ] My own task or dataset (give details below)

Reproduction

accelerate launch --config_file examples/accelerate_configs/deepspeed_zero3.yaml \
    examples/scripts/rloo/rloo.py \
    --dataset_name trl-internal-testing/descriptiveness-sentiment-trl-style \
    --dataset_train_split descriptiveness \
    --output_dir models/minimal/rloo \
    --rloo_k 2 \
    --num_ppo_epochs 1 \
    --num_mini_batches 1 \
    --learning_rate 3e-6 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --total_episodes 10000 \
    --model_name_or_path EleutherAI/pythia-1b-deduped \
    --sft_model_path EleutherAI/pythia-1b-deduped \
    --reward_model_path EleutherAI/pythia-1b-deduped \
    --local_rollout_forward_batch_size 1 \
    --missing_eos_penalty 1.0

Expected behavior

The example script should train under DeepSpeed without raising the ValueError quoted above. The likely cause is that the Accelerator is not properly initialized at line 120 of the RLOOTrainer code, possibly because the deepspeed_plugin was not passed in.
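
For anyone trying to narrow this down, here is a rough diagnostic sketch for checking whether accelerate's shared state has been populated at a given point in the script. It is not taken from the trl code. The function name `accelerate_state_report` is hypothetical, and it reads the private `_shared_state` attributes of `PartialState` and `AcceleratorState`, which are internals and may change between accelerate releases.

```python
def accelerate_state_report():
    """Report whether accelerate's shared state looks initialized.

    Assumption: relies on the private `_shared_state` class attributes,
    which are accelerate internals and not a stable public API.
    """
    try:
        from accelerate.state import AcceleratorState, PartialState
    except ImportError:
        # accelerate (or one of its dependencies) is not importable here.
        return {"accelerate_installed": False}

    return {
        "accelerate_installed": True,
        # Non-empty once PartialState() or Accelerator() has been created.
        "partial_state_initialized": len(PartialState._shared_state) > 0,
        # Non-empty once AcceleratorState has been set up.
        "accelerator_state_initialized": len(AcceleratorState._shared_state) > 0,
    }


if __name__ == "__main__":
    for key, value in accelerate_state_report().items():
        print(f"{key}: {value}")
```

Calling this just before the trainer is constructed, and again where the error is raised, might show whether the state was never created or was created without the DeepSpeed settings.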

naskimed commented 5 days ago

Hey, I have the same issue using PPOTrainer: "ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any functionality from the accelerate library".

trl: 0.13.0.dev0
transformers: 4.46.2
accelerate: 1.1.0.dev0

KAKSIS commented 4 days ago

I have the same problem

kongjiellx commented 4 days ago

+1 with PPOTrainer

macheng6 commented 1 day ago

With the following versions, the code runs: trl==0.11.4, accelerate==0.33.0.
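
A minimal sketch for checking whether an environment matches these versions, using only the standard library. The helper names (`installed_version`, `find_mismatches`) and the `REPORTED_WORKING` dictionary are made up for illustration; the pinned values are the ones reported in this comment.

```python
from importlib import metadata

# Versions reported as working in this thread (not an official compatibility matrix).
REPORTED_WORKING = {"trl": "0.11.4", "accelerate": "0.33.0"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a tuple of (installed, pinned)."""
    mismatches = {}
    for package, pinned in pins.items():
        found = lookup(package)
        if found != pinned:
            mismatches[package] = (found, pinned)
    return mismatches


if __name__ == "__main__":
    for package, (found, pinned) in find_mismatches(REPORTED_WORKING).items():
        print(f"{package}: installed {found}, thread reports {pinned} as working")
```

If a mismatch shows up, installing the pinned versions (for example with pip install trl==0.11.4 accelerate==0.33.0) should reproduce the setup described here.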
