Closed GasolSun36 closed 1 year ago
I encountered the same issue, also in ds-chat step3 training for Llama-2-7b-hf. When I enabled --print_answers, I found that the answers were empty strings:
--- prompt --> step=2, rank=2, ['\n\nHuman: Is it hard to become an air traffic controller?\n\nAssistant:']
--- prompt --> step=2, rank=1, ["\n\nHuman: I'd like to give a toast at my Christmas dinner party.\n\nAssistant:"]
--- prompt --> step=2, rank=0, ['\n\nHuman: How do I get a plumbers license.\n\nAssistant:']
--- ans --> step=2, rank=2, ['']
--- ans --> step=2, rank=1, ['']
--- ans --> step=2, rank=0, ['']
And when I printed the generated sequence:
# ppo_trainer.py
with torch.no_grad():
    seq = self.actor_model.module.generate(
        prompts,
        attention_mask=mask,
        max_length=max_min_length,
        pad_token_id=self.tokenizer.pad_token_id,
        synced_gpus=self.z3_enabled,
        **kwargs)
print(seq)
it looked like this:
[tensor([[32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000,
32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000,
32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000,
32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000,
32000, 32000, 1, 29871, 13, 13, 29950, 7889, 29901, 6324,
29892, 306, 29915, 29881, 763, 304, 1369, 3704, 6483, 17905,
29872, 289, 1975, 297, 590, 2814, 664, 449, 29889, 1815,
366, 8453, 920, 304, 2189, 445, 15058, 29973, 13, 13,
7900, 22137, 29901, 18585, 29991, 739, 30010, 29879, 2289, 4780,
29889, 2266, 30010, 29879, 263, 9004, 362, 29901, 13, 13,
6730, 29892, 2317, 7812, 411, 596, 6900, 23468, 2920, 12435,
29892, 322, 26681, 29889, 29871, 13, 13, 9190, 29892, 289,
355, 596, 17905, 267, 29892, 24421, 596, 6567, 373, 596,
266, 1141, 29879, 470, 373, 596, 298, 4512, 29889, 13,
13, 10454, 29892, 3965, 596, 540, 1379, 964, 278, 11904,
322, 1369, 304, 7812, 264, 596, 21152, 2745, 366, 508,
29915, 29873, 748, 738, 26645, 29889, 29871, 13, 13, 12881,
635, 29892, 289, 355, 1250, 1623, 964, 278, 6483, 17905,
29872, 289, 355, 2602, 29889, 13, 13, 7058, 29915, 29879,
599, 727, 338, 304, 372, 29991, 29871, 2803, 592, 1073,
565, 366, 505, 738, 5155, 29889, 13, 13, 29950, 7889,
29901, 20419, 306, 505, 777, 5155, 1244, 29889, 887, 2649,
592, 304, 289, 355, 590, 17905, 267, 29892, 322, 769,
7812, 264, 590, 21152, 29889, 1724, 5304, 1546, 1438, 24147,
29892, 920, 1568, 626, 306, 289, 2548, 590, 17905, 267,
29973, 13, 13, 7900, 22137, 29901, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]], device='cuda:0')]
key env info:
After that, I discovered issue huggingface/transformers#25790, and I attempted to modify the tokenizer's config:
# utils/utils.py
if "llama" in model_name_or_path:
    from transformers.models.llama import LlamaTokenizer
    tokenizer = LlamaTokenizer.from_pretrained(
        model_name_or_path, fast_tokenizer=fast_tokenizer)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.unk_token
        # tokenizer.add_special_tokens({'pad_token': '[PAD]'})
        tokenizer.padding_side = 'left'
With this change, the generated sequences look like this:
tensor([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 29871,
13, 13, 29950, 7889, 29901, 817, 10529, 373, 1209, 573,
17869, 13, 13, 7900, 22137, 29901, 306, 508, 29915, 29873,
2289, 3867, 1906, 1492, 1286, 29889, 1724, 306, 508, 437,
338, 1303, 8277, 322, 19138, 675, 963, 363, 366, 29892,
470, 4511, 366, 411, 385, 4148, 366, 1795, 1284, 8444,
29889, 13, 13, 29950, 7889, 29901, 3431, 825, 8277, 437,
366, 6907, 13, 13, 7900, 22137, 29901, 306, 508, 2367,
366, 263, 1051, 310, 278, 2246, 8277, 297, 1784, 1422,
13997, 29889, 13, 13, 29950, 7889, 29901, 3431, 2649, 592,
901, 13, 13, 7900, 22137, 29901, 306, 508, 2649, 366,
1048, 278, 2246, 8277, 297, 278, 1494, 13997, 29901, 13,
13, 29899, 259, 383, 2463, 13, 29899, 259, 10050, 29899,
29888, 2463, 13, 29899, 259, 15197, 13, 29899, 259, 21782,
29899, 8477, 13, 29899, 259, 15202, 13, 29899, 259, 9327,
13, 29899, 259, 27099, 13, 29899, 259, 5298, 13, 29899,
259, 16407, 29891, 13, 29899, 259, 20986, 29915, 29879, 8277,
13, 29899, 259, 17278, 12733, 13, 29899, 259, 3201, 955,
13, 29899, 259, 8133, 6390, 29879, 13, 29899, 259, 3929,
27184, 13, 29899, 259, 22890, 708, 13, 29899, 259, 6033,
749, 13, 29899, 259, 498, 29878, 5495, 13, 29899, 259,
10443, 16157, 13, 29899, 259, 21782, 29899, 8477, 13, 29899,
259, 21782, 29899, 326, 16123, 882, 13, 29899, 259, 21782,
29899, 8477, 13, 29899, 259, 21782, 29899, 8477, 13, 29899,
259, 21782, 29899, 8477, 13, 29899, 259, 21782, 29899, 8477,
13, 29899, 259, 21782, 29899, 8477, 13, 29899, 259, 21782,
29899, 8477, 13, 29899, 259, 21782, 29899, 8477, 13, 29899,
259, 21782, 29899, 8477, 13, 29899, 259, 21782, 29899, 8477,
13, 29899, 259, 21782, 29899, 8477, 13, 29899, 259, 21782,
29899, 8477, 13, 29899, 259, 21782, 29899, 8477, 13, 29899,
259, 21782, 29899, 8477, 13, 29899, 259, 21782, 29899, 8477,
13, 29899, 259, 21782, 29899, 8477, 13, 29899, 259, 21782,
29899, 8477, 13, 29899, 259, 21782, 29899, 8477, 13, 29899,
259, 21782, 29899, 8477, 13, 29899, 259, 21782, 29899, 8477,
13, 29899, 259, 21782, 29899, 8477, 13, 29899, 259, 21782,
29899, 8477, 13, 29899, 259, 21782, 29899, 8477, 13, 29899,
259, 21782]], device='cuda:0')
tensor([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 29871, 13, 13, 29950,
7889, 29901, 306, 471, 13858, 14171, 263, 25008, 29889, 5806,
306, 505, 263, 289, 496, 12779, 29915, 29879, 7426, 7743,
29892, 825, 9793, 322, 7794, 575, 292, 526, 5181, 29892,
304, 4953, 263, 1985, 25008, 29973, 13, 13, 7900, 22137,
29901, 1670, 526, 1784, 4072, 310, 4307, 29891, 414, 29889,
1763, 664, 408, 263, 970, 822, 1581, 29892, 470, 297,
278, 4038, 310, 22161, 4307, 29892, 366, 674, 12234, 817,
304, 748, 304, 4307, 3762, 29892, 988, 366, 674, 505,
304, 2125, 4413, 322, 1209, 429, 2232, 29889, 1205, 565,
366, 526, 8852, 297, 5874, 470, 17266, 403, 664, 29892,
372, 1122, 451, 367, 5181, 304, 748, 304, 4307, 3762,
29889, 2860, 366, 10591, 403, 515, 4307, 3762, 29892, 366,
674, 12234, 505, 304, 1209, 263, 7794, 575, 292, 4392,
29889, 13, 13, 29950, 7889, 29901, 1128, 1784, 2440, 947,
263, 3619, 4307, 7426, 2125, 304, 679, 29973, 1126, 338,
372, 15574, 763, 263, 5835, 29915, 29879, 1824, 29892, 925,
263, 2846, 2440, 29892, 470, 763, 385, 5684, 2989, 7426,
29973, 13, 13, 7900, 22137, 29901, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]], device='cuda:2')
So the answer can still be an empty string, and the errors still occur.
Maybe the issue lies in the .generate() method of transformers, and for now we have to wait for them to fix it?
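To see why the printed answers come out empty, it may help to sketch how an answer is cut out of a generated sequence. The following is a simplified pure-Python illustration, not the actual ppo_trainer.py code: the helper name extract_answer and the toy token IDs are made up, and it assumes the answer is everything after the prompt with pad tokens discarded.

```python
def extract_answer(seq, prompt_length, pad_token_id):
    """Return the non-pad token IDs that follow the prompt.

    Hypothetical helper for illustration; seq is a flat list of token IDs.
    """
    answer = seq[prompt_length:]
    return [tok for tok in answer if tok != pad_token_id]


# Toy sequence: 3 left-pad tokens, a 4-token prompt, then 5 generated IDs.
pad_id = 0
prompt = [0, 0, 0, 1, 7900, 22137, 29901]
healthy = prompt + [306, 508, 2649, 366, 0]   # real tokens, then padding
broken = prompt + [0, 0, 0, 0, 0]             # only pad IDs after the prompt

healthy_answer = extract_answer(healthy, len(prompt), pad_id)
broken_answer = extract_answer(broken, len(prompt), pad_id)
print(len(healthy_answer), len(broken_answer))  # prints "4 0"
```

A sequence like the second one, where only pad IDs follow the prompt, decodes to an empty string, which matches the [''] answers printed above.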
Hi, I have solved the issue; see https://github.com/microsoft/DeepSpeed/issues/4229#issuecomment-1704004959
Describe the bug
I trained two LLAMA-2-7B-HF models as the actor and critic in the first two steps of DeepSpeed-Chat without any problems. When I ran DeepSpeed-Chat step3 RLHF training, the following error was reported:
IndexError: argmax(): Expected reduction dim 1 to have non-zero size.
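The message says that the dimension being reduced (dim 1, the score axis) has size zero, so there is nothing to take a maximum over. A rough pure-Python analogue of that situation follows; it is not the torch or transformers implementation, and greedy_pick is a hypothetical helper name.

```python
def greedy_pick(score_rows):
    """Return the index of the largest score in each row (one greedy step).

    Hypothetical stand-in for an argmax over the last dimension.
    """
    picks = []
    for row in score_rows:
        if len(row) == 0:
            # Mirrors the failure mode: nothing to reduce over.
            raise IndexError("argmax over an empty score row")
        picks.append(max(range(len(row)), key=row.__getitem__))
    return picks


normal = greedy_pick([[0.1, 2.5, 0.3], [4.0, 1.0, 0.0]])
print(normal)  # prints "[1, 0]"

try:
    greedy_pick([[], []])  # batch of 2, zero scores per row
    failed = False
except IndexError:
    failed = True
print(failed)  # prints "True"
```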
I looked deeper, and the error appears to be raised while executing the generate function (in DeepSpeedExamples-master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py): outputs['logits'] is empty (in transformers/generation/utils.py). Output of print:
This means that my actor model doesn't actually generate any tokens at all. I checked the prompts tensor passed
in generate function parameters and this is fine:tensor([[32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 32000, 1, 29871, 13, 13, 29950, 7889, 29901, 817, 10529, 373, 1209, 573, 17869, 13, 13, 7900, 22137, 29901, 306, 508, 29915, 29873, 2289, 3867, 1906, 1492, 1286, 29889, 1724, 306, 508, 437, 338, 1303, 8277, 322, 19138, 675, 963, 363, 366, 29892, 470, 4511, 366, 411, 385, 4148, 366, 1795, 1284, 8444, 29889, 13, 13, 29950, 7889, 29901, 3431, 825, 8277, 437, 366, 6907, 13, 13, 7900, 22137, 29901, 306, 508, 2367, 366, 263, 1051, 310, 278, 2246, 8277, 297, 1784, 1422, 13997, 29889, 13, 13, 29950, 7889, 29901, 3431, 2649, 592, 901, 13, 13, 7900, 22137, 29901]], device='cuda:0')
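A quick way to sanity-check a left-padded prompt like this one is to count the leading pad IDs and look at what remains. This is only a sketch: split_left_padding is a made-up helper, and the short list below is a toy stand-in for the tensor above (pad ID 32000, then ID 1 where the real prompt starts).

```python
def split_left_padding(ids, pad_token_id):
    """Split a left-padded ID list into (number of pads, remaining IDs)."""
    n_pad = 0
    while n_pad < len(ids) and ids[n_pad] == pad_token_id:
        n_pad += 1
    return n_pad, ids[n_pad:]


# Toy stand-in for the prompt tensor: 3 pads, then the start of the prompt.
toy_prompt = [32000, 32000, 32000, 1, 29871, 13, 13, 29950, 7889, 29901]
n_pad, content = split_left_padding(toy_prompt, pad_token_id=32000)
print(n_pad, content[0])  # prints "3 1"
```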
Using the Llama-2 tokenizer, we can convert it to sentences:
Log output
Traceback (most recent call last):
  File "/data1/sjs/DeepSpeedExamples-master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py", line 635, in <module>
    main()
  File "/data1/sjs/DeepSpeedExamples-master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py", line 493, in main
    out = trainer.generate_experience(batch_prompt['prompt'],
  File "/data1/sjs/DeepSpeedExamples-master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 124, in generate_experience
    seq = self._generate_sequence(prompts, mask, step)
  File "/data1/sjs/DeepSpeedExamples-master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 84, in _generate_sequence
    seq = self.actor_model.module.generate(
  File "/home/xuchengjin/anaconda3/envs/llm/lib/python3.10/site-packages/deepspeed/runtime/hybrid_engine.py", line 253, in generate
    generate_ret_vals = self._generate(*inputs, **kwargs)
  File "/home/xuchengjin/anaconda3/envs/llm/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/xuchengjin/anaconda3/envs/llm/lib/python3.10/site-packages/transformers/generation/utils.py", line 1596, in generate
    return self.greedy_search(
  File "/home/xuchengjin/anaconda3/envs/llm/lib/python3.10/site-packages/transformers/generation/utils.py", line 2477, in greedy_search
    next_tokens = torch.argmax(next_tokens_scores, dim=-1)
IndexError: argmax(): Expected reduction dim 1 to have non-zero size.
To Reproduce
step2 run.sh:
Expected behavior Above.
ds_report output
Screenshots above.
System info (please complete the following information):
Docker context No docker used.
Additional context No.