Open: aishu194 opened this issue 1 year ago
I encountered the same error. Using python -m pdb, I inspected the tensor's shape at runtime.
Initially it had the correct 2-D shape:
(Pdb) up
> ..../pytorch1.13.1/lib/python3.9/site-packages/trl/trainer/ppo_trainer.py(454)generate()
-> response = self.accelerator.unwrap_model(self.model).generate(
(Pdb) p query_tensor
tensor([[ 1, 12027, 7420, 278, 2224, 3021, 952, 29875, 3002, 310,
379, 5667, 29914, 29909, 1367, 29903, 29889, 4121, 993, 13676,
17091, 5065, 3381, 322, 521, 342, 6788, 363, 278, 4940,
4723, 29889]], device='cuda:0')
(Pdb) p query_tensor.shape
torch.Size([1, 32])
But the code at line 455 then added a new dimension with the statement input_ids=query_tensor.unsqueeze(dim=0)
As a result, by the time execution reaches pytorch1.13.1/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py, line 623:
batch_size, seq_length = input_ids.shape
the tensor is 3-D:
(Pdb) p input_ids.shape
torch.Size([1, 1, 32])
The assignment tries to unpack a 3-D shape into two variables, triggering the error:
ValueError: too many values to unpack (expected 2)
This seems to be a bug in the package.
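For illustration, here is a minimal sketch (with a dummy tensor standing in for the real query) that reproduces the shape mismatch outside the trainer:

import torch

# query_tensor already carries a batch dimension: shape (1, 32)
query_tensor = torch.ones(1, 32, dtype=torch.long)

# ppo_trainer.generate adds another one via unsqueeze, giving (1, 1, 32)
input_ids = query_tensor.unsqueeze(dim=0)
print(input_ids.shape)  # torch.Size([1, 1, 32])

# the two-variable unpack inside modeling_llama.py then fails
try:
    batch_size, seq_length = input_ids.shape
except ValueError as e:
    print(e)  # too many values to unpack (expected 2)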
Somebody reported a similar problem and a solution on Stack Overflow: https://stackoverflow.com/questions/67193312/huggingface-transformers-returning-valueerror-too-many-values-to-unpack-expec
Essentially, the code would need to ignore the leading dimension, using something like
_, batch_size, seq_length = input_ids.shape
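Alternatively, a caller-side workaround that avoids patching either library may be to pass the query as a 1-D tensor, so that the unsqueeze inside ppo_trainer.py recreates the expected 2-D batch. A sketch, assuming a single query per call:

# query_tensor has shape (1, seq_len); pass it as (seq_len,) so that
# ppo_trainer.generate's unsqueeze(dim=0) restores the (1, seq_len) shape
response_tensor = ppo_trainer.generate(
    query_tensor.squeeze(0),
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=20,
)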
I suggest adding "ValueError: too many values to unpack (expected 2)" to the issue title so others can find this error more easily.
Thank you for the clear-cut, amazing video tutorial and repo. I have been working with this repo and ran into the following issue on an 8-GPU A100 machine (100 GB OS disk, 5 TB external storage). Could you kindly help me with this?
Traceback (most recent call last):
  File "rl_finetuning.py", line 175, in <module>
    response_tensor = ppo_trainer.generate(query_tensor, pad_token_id=tokenizer.eos_token_id, max_new_tokens=20)
  File "/data-mount/trl/trl/trainer/ppo_trainer.py", line 450, in generate
    response = self.accelerator.unwrap_model(self.model).generate(
  File "/data-mount/trl/trl/models/modeling_value_head.py", line 198, in generate
    return self.pretrained_model.generate(*args, **kwargs)
  File "/home/aishu/.local/lib/python3.8/site-packages/peft/peft_model.py", line 977, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/home/aishu/.local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/aishu/.local/lib/python3.8/site-packages/transformers/generation/utils.py", line 1642, in generate
    return self.sample(
  File "/home/aishu/.local/lib/python3.8/site-packages/transformers/generation/utils.py", line 2724, in sample
    outputs = self(
  File "/home/aishu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/aishu/.local/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 809, in forward
    outputs = self.model(
  File "/home/aishu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/aishu/.local/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 628, in forward
    batch_size, seq_length = input_ids.shape
ValueError: too many values to unpack (expected 2)
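For reference, the failing call at rl_finetuning.py line 175 could be guarded caller-side along the lines of the workaround above. A sketch (the dimension check is an assumption, not part of the original script):

# drop a leading batch dimension if present, since ppo_trainer.generate
# re-adds one via unsqueeze(dim=0)
if query_tensor.dim() == 2 and query_tensor.size(0) == 1:
    query_tensor = query_tensor.squeeze(0)
response_tensor = ppo_trainer.generate(query_tensor, pad_token_id=tokenizer.eos_token_id, max_new_tokens=20)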