microsoft / DeepSpeedExamples

Example models using DeepSpeed
Apache License 2.0

Finetuning Bloom model in step 3 failed #451

Closed. cokuehuang closed this issue 1 year ago.

cokuehuang commented 1 year ago

Actor model: Bloom-1.1b
Reward model: Bloom-560m
Finetuning cmd:

bash training_scripts/single_node/run_bloom_1.1b.sh /DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/bloom-1.1b/ /DeepSpeedExamples/applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/reward_model/bloom-560m

Part of training log:

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer.
------------------------------------
Free memory : 67.217224 (GigaBytes)
Total memory: 79.346863 (GigaBytes)
Requested memory: 0.437500 (GigaBytes)
Setting maximum total tokens (input + output) to 512
WorkSpace: 0x7f8d36000000
------------------------------------
Traceback (most recent call last):
  File "main.py", line 525, in <module>
    main()
  File "main.py", line 434, in main
    out = trainer.generate_experience(prompts)
  File "/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 97, in generate_experience
    seq = self._generate_sequence(prompts)
  File "/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 73, in _generate_sequence
    seq = self.actor_model.module.generate(prompts,
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/deepspeed/runtime/hybrid_engine.py", line 254, in generate
    generate_ret_vals = self._generate(*inputs, **kwargs)
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/transformers/src/transformers/generation/utils.py", line 1513, in generate
    return self.greedy_search(
  File "/transformers/src/transformers/generation/utils.py", line 2330, in greedy_search
    outputs = self(
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/transformers/src/transformers/models/bloom/modeling_bloom.py", line 913, in forward
    transformer_outputs = self.transformer(
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/transformers/src/transformers/models/bloom/modeling_bloom.py", line 786, in forward
    outputs = block(
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 147, in forward
    self.attention(input,
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/deepspeed/ops/transformer/inference/ds_attention.py", line 160, in forward
    context_layer, key_layer, value_layer = self.compute_attention(qkv_out=qkv_out,
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/deepspeed/ops/transformer/inference/ds_attention.py", line 253, in compute_attention
    attn_mask=((1 - input_mask).half() * minus_inf),
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/torch/_tensor.py", line 40, in wrapped
    return f(*args, **kwargs)
  File "/opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/torch/_tensor.py", line 848, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `logical_not()` operator instead.

However, changing the model to OPT works well.
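For reference, a minimal standalone sketch (not repo code) that reproduces the failing operation and shows the cast that avoids it, assuming a boolean attention mask like the one the tokenizer produces:

```python
import torch

# Hypothetical boolean attention mask of shape (batch, seq_len)
input_mask = torch.tensor([[True, True, False]])
minus_inf = -10000.0

# What ds_attention.py effectively does, and what raises the RuntimeError:
#   attn_mask = (1 - input_mask).half() * minus_inf   # `-` on a bool tensor is unsupported

# Casting the mask to an integer dtype first makes the subtraction legal:
attn_mask = (1 - input_mask.int()).half() * minus_inf

# Equivalent form using logical_not, as the error message itself suggests:
attn_mask_alt = (~input_mask).half() * minus_inf
assert torch.equal(attn_mask, attn_mask_alt)
```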

evi-Genius commented 1 year ago

same error

lc222 commented 1 year ago

same error

LiinXemmon commented 1 year ago

Same error. Modifying ds_attention.py raises NotImplementedError.

lc222 commented 1 year ago

Similar, but not the same, error:

File "main.py", line 552, in <module> main() File "main.py", line 458, in main 192.18.75.0: out = trainer.generate_experience(prompts) 192.18.75.0: File "/baichuan/haoyu/DeepSpeedExamples-master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 203, in generate_experience 192.18.75.0: seq = self._generate_sequence(prompts) 192.18.75.0: File "/baichuan/haoyu/DeepSpeedExamples-master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 161, in _generate_sequence 192.18.75.0: seq = self.actor_model.module.generate(prompts, 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context 192.18.75.0: return func(*args, **kwargs) 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/transformers/generation/utils.py", line 1513, in generate 192.18.75.0: return self.greedy_search( 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/transformers/generation/utils.py", line 2330, in greedy_search 192.18.75.0: outputs = self( 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl 192.18.75.0: result = forward_call(*args, **kwargs) 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/transformers/models/bloom/modeling_bloom.py", line 913, in forward 192.18.75.0: transformer_outputs = self.transformer( 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl 192.18.75.0: result = forward_call(*args, **kwargs) 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/transformers/models/bloom/modeling_bloom.py", line 730, in forward 192.18.75.0: inputs_embeds = self.word_embeddings(input_ids) 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl 192.18.75.0: result = forward_call(*args, **kwargs) 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 162, in forward 192.18.75.0: return F.embedding( 192.18.75.0: File "/baichuan/anaconda3/envs/deepspeedchat/lib/python3.8/site-packages/torch/nn/functional.py", line 2210, in embedding 192.18.75.0: return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse) 192.18.75.0: RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.FloatTensor instead (while checking arguments for embedding)

What should I do to fix this error?
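This one fails earlier, in the embedding lookup: F.embedding only accepts Long/Int indices, so input_ids are arriving as floats somewhere upstream (worth checking what _generate_sequence passes to generate). A minimal standalone sketch of the failure mode and the cast, not DeepSpeed-Chat code:

```python
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

# Float indices reproduce the RuntimeError from the traceback above:
bad_ids = torch.tensor([[1.0, 2.0, 3.0]])
# emb(bad_ids)  # RuntimeError: Expected tensor for argument #1 'indices' ... Long, Int

# Embedding lookups require integer indices; casting restores the expected dtype:
good_ids = bad_ids.long()
print(emb(good_ids).shape)  # torch.Size([1, 3, 4])
```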

stgzr commented 1 year ago

Any update to this issue?

scarydemon2 commented 1 year ago

Same error for actor model bloomz-7b1 and reward model opt-1.3b.

scarydemon2 commented 1 year ago

> Same error. Modifying ds_attention.py raises NotImplementedError.

The NotImplementedError is caused by the softmax function when config.fp16 is False. Perhaps you've changed fp16 to bf16 in ds_utils.py according to some issue (same as me). To solve this, in /opt/conda/envs/deepspeedchat/lib/python3.8/site-packages/deepspeed/ops/transformer/inference/ds_attention.py, line 253, in compute_attention, change attn_mask=((1 - input_mask).half() * minus_inf) into attn_mask=((1 - input_mask.int()).half() * minus_inf). This works for me.
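Spelled out as a before/after excerpt (this is a keyword argument inside the compute_attention call, so not standalone code; the line number comes from the traceback above and may differ across DeepSpeed versions):

```python
# deepspeed/ops/transformer/inference/ds_attention.py, compute_attention

# before -- fails when input_mask is a bool tensor:
attn_mask=((1 - input_mask).half() * minus_inf),

# after -- cast to int first so the subtraction is defined:
attn_mask=((1 - input_mask.int()).half() * minus_inf),
```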

scarydemon2 commented 1 year ago

> Same error. Modifying ds_attention.py raises NotImplementedError.

> The NotImplementedError is caused by the softmax function when config.fp16 is False. Perhaps you've changed fp16 to bf16 in ds_utils.py according to some issue (same as me). To solve this, change attn_mask=((1 - input_mask).half() * minus_inf) into attn_mask=((1 - input_mask.int()).half() * minus_inf). This works for me.

Not working at all. The padding_side for OPT is right, while for BLOOMZ it is left. I tried passing in two different tokenizers, but that caused a lot of conflicts when generating the experience.
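On the padding point, the warning at the top of the thread asks for left padding on decoder-only models. A minimal sketch of setting it, assuming the Hugging Face AutoTokenizer API (the checkpoint name here is just illustrative):

```python
from transformers import AutoTokenizer

# Decoder-only models (BLOOM/BLOOMZ, OPT) continue from the last token, so
# batched generation wants left padding -- this is what the warning at the
# top of the training log is about.
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-1b1", padding_side="left")

# The setting can also be flipped on an existing tokenizer:
tokenizer.padding_side = "left"
```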

jeffra commented 1 year ago

Similar issue on DeepSpeed side: https://github.com/microsoft/DeepSpeed/issues/3518

roy-mzh commented 1 year ago

Same error with actor model bloom-560m and critic model opt-350m. Any update?

lekurile commented 1 year ago

Hi @cokuehuang,

Can you please try running this again and include the following PR as well:

I've been able to get this running with the bigscience/bloomz-1b7 BLOOM model:

DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning$ bash training_scripts/bloom/single_node/run_bloom.sh bigscience/bloomz-1b7 ../step2_reward_model_finetuning/bloom_7b_output/ 3 3 output_bloom7b_actor_hf_critic_step2

Thanks, Lev

lekurile commented 1 year ago

Hi @cokuehuang,

Closing the issue for now since a solution was provided. If you still encounter any issues, feel free to open another one.