huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0

trl dpo AttributeError: 'generator' object has no attribute 'generate' #2292

Closed MonolithFoundation closed 4 weeks ago

MonolithFoundation commented 4 weeks ago

trl dpo AttributeError: 'generator' object has no attribute 'generate'

    print('start training...')
    if list(pathlib.Path(training_args.output_dir).glob("checkpoint-*")):
        trainer.train(resume_from_checkpoint=True)
    else:
        trainer.train()
    trainer.save_state()

    model.config.use_cache = True
    if training_args.lora_enable:
        state_dict = get_peft_state_maybe_zero_3(
            model.named_parameters(), training_args.lora_bias
        )
        non_lora_state_dict = get_peft_state_non_lora_maybe_zero_3(
            model.named_parameters()
        )
        if training_args.local_rank == 0 or training_args.local_rank == -1:
            model.config.save_pretrained(training_args.output_dir)
            model.save_pretrained(training_args.output_dir, state_dict=state_dict)
            torch.save(
                non_lora_state_dict,
                os.path.join(training_args.output_dir, "non_lora_trainables.bin"),
            )
        # todo: handle if vision_lora_enable, save vision adapter and llm adapter
    else:
        safe_save_model_for_hf_trainer(
            trainer=trainer, output_dir=training_args.output_dir
        )

The model should be fine, so why does it keep printing this error:

train_mdpo.py", line 757, in train
    trainer.train()
  File "/data/libs/transformers/src/transformers/trainer.py", line 2122, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/data/libs/transformers/src/transformers/trainer.py", line 2426, in _inner_training_loop
    batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/trl/trainer/dpo_trainer.py", line 1508, in get_batch_samples
    policy_output = model.generate(
                    ^^^^^^^^^^^^^^
AttributeError: 'generator' object has no attribute 'generate'
qgallouedec commented 4 weeks ago

Thanks for reporting. Next time, please share your system info (as requested in the contribution guide and in the issue template). It would have been especially relevant here.

You're most likely using Transformers v4.46, which is not compatible with TRL < v0.12 (about to be released). Make sure to downgrade transformers:

pip install transformers"<=4.45"

OR

Upgrade to TRL >= 0.12 (this won't work before the release):

pip install trl">=0.12"

for ref, this issue has been solved in #2246
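
The compatibility rule above can be written down as a tiny helper. This is a hypothetical sketch (`is_incompatible` is not part of trl or transformers); it just encodes "transformers >= 4.46 together with trl < 0.12 hits the `get_batch_samples` clash":

```python
def is_incompatible(transformers_version: str, trl_version: str) -> bool:
    """True when the pairing hits the get_batch_samples clash:
    transformers >= 4.46 together with trl < 0.12."""
    def minor(v: str) -> tuple:
        # Compare only the (major, minor) components of the version string.
        parts = v.split(".")
        return (int(parts[0]), int(parts[1]))

    return minor(transformers_version) >= (4, 46) and minor(trl_version) < (0, 12)


print(is_incompatible("4.47.0", "0.11.4"))  # True  (the reporter's setup)
print(is_incompatible("4.45.2", "0.11.4"))  # False (downgraded transformers)
print(is_incompatible("4.47.0", "0.12.0"))  # False (upgraded trl)
```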

MonolithFoundation commented 4 weeks ago

Hi, I'm using transformers 4.47 and trl 0.11.4.

Could you tell me when 0.12 will be released, and why this error happens with trl 0.11.4?

monk1337 commented 3 weeks ago

Worked for me as well. I was using Unsloth and getting this error.

MonolithFoundation commented 3 weeks ago

I still don't get the root cause of this. The API changes so rapidly.

qgallouedec commented 3 weeks ago

In our trl trainers, we had the following method:

def get_batch_samples(self, model, batch):

However, with the recent addition in Hugging Face Transformers PR #34198, Trainer now includes a new get_batch_samples method:

def get_batch_samples(self, epoch_iterator, num_batches):

This new method has the same name but a different purpose and parameter structure.

Since our trl trainer inherits from the Transformers Trainer class, our original get_batch_samples method in trl unintentionally overrides the new method in Trainer. This causes a conflict: when self.get_batch_samples(epoch_iterator, num_batches) is called, it actually dispatches to our trl method, whose signature is get_batch_samples(model, batch).

Consequently, epoch_iterator is bound to the model parameter, so when the method executes model.generate(...), it raises an AttributeError, because model is now a generator rather than a model with a .generate method. This leads to the error:

policy_output = model.generate(
                ^^^^^^^^^^^^^^
AttributeError: 'generator' object has no attribute 'generate'

To resolve this, we renamed the method in #2246.
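
The collision can be reproduced in isolation. Below is a minimal self-contained sketch (the class names only stand in for the real transformers.Trainer and trl trainer; none of this is the actual library code): a subclass overrides a base-class method of the same name but different meaning, so the base class's call site binds the epoch iterator to the subclass's model parameter.

```python
class Trainer:
    """Stands in for transformers.Trainer after PR #34198."""

    def get_batch_samples(self, epoch_iterator, num_batches):
        # New upstream helper: pull `num_batches` items off the iterator.
        return [next(epoch_iterator) for _ in range(num_batches)], num_batches

    def train(self, data):
        epoch_iterator = (x for x in data)  # a generator, as in the real loop
        # The call goes through `self`, so a subclass override wins --
        # even if it expects entirely different arguments.
        return self.get_batch_samples(epoch_iterator, 2)


class DPOTrainer(Trainer):
    """Stands in for trl's trainer before the rename in #2246."""

    def get_batch_samples(self, model, batch):
        # `model` is actually the epoch iterator (a generator),
        # which has no `.generate` attribute -- hence the AttributeError.
        return model.generate(batch)


trainer = DPOTrainer()
try:
    trainer.train([1, 2, 3])
except AttributeError as e:
    print(e)  # 'generator' object has no attribute 'generate'
```

The fix in #2246 amounts to giving the trl-side method a name that no longer shadows the base-class helper.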

MonolithFoundation commented 3 weeks ago

@qgallouedec So it is! However, after I upgraded trl to the master branch, the error still persists. Why?

qgallouedec commented 3 weeks ago

Please share your system info with trl env

maziyarpanahi commented 2 weeks ago

I am getting this error as well, and I am also confused about the versions, backward compatibility, and the fix. What combination of the transformers and trl libraries resolves this issue? (Which versions of these two libraries should we install so we don't see the error today?)

Installing from master worked. Thanks!

qgallouedec commented 2 weeks ago

pip install --upgrade trl