microsoft / DeepSpeedExamples

Example models using DeepSpeed
Apache License 2.0

Llama2 as actor using zero_stage3 #814

Open George-Chia opened 10 months ago

George-Chia commented 10 months ago

Hello! Has anyone hit the following error when using zero_stage3 with Llama2?

step3_rlhf_finetuning/rlhf_engine.py:61 in __init__

    58         self.num_total_iters = num_total_iters
    59         self.tokenizer = tokenizer
    60
  ❱ 61         self.actor = self._init_actor(actor_model_name_or_path=actor_model_name_or_path)

AttributeError: 'LlamaAttention' object has no attribute 'rope_theta'.

Note that OPT works, and using zero_stage2 also works.

Jeayea commented 9 months ago

I got the same error with transformers==4.32.0. After upgrading to transformers==4.34.0, the error is gone. FYI.
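
For anyone who wants to check their environment before launching step 3, here is a minimal sketch of a version check. It assumes the versions mentioned in the comment above belong to the transformers 4.x series (4.32.0 failing, 4.34.0 working); the threshold constant and the helper names `parse_release`, `is_at_least`, and `check_installed` are made up for illustration and are not part of DeepSpeedExamples or transformers. The thread does not establish which exact release introduced the fix, so treat the threshold as "first version reported to work here".

```python
from importlib import metadata

# Assumption: first release reported as working in this thread (4.x series).
MIN_TRANSFORMERS = "4.34.0"


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '4.34.0.dev0' -> (4, 34, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # component such as 'dev0' has no leading number
        parts.append(int(digits))
        if digits != piece:
            break  # stop at a suffix such as 'rc1'
    return tuple(parts)


def is_at_least(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Compare two version strings by their numeric release components."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.35' compares like '4.35.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(package: str = "transformers") -> str:
    """Describe whether the installed package meets the assumed minimum."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed"
    if is_at_least(installed):
        return f"{package} {installed} is at or above {MIN_TRANSFORMERS}"
    return f"{package} {installed} is older than {MIN_TRANSFORMERS}; consider upgrading"


if __name__ == "__main__":
    print(check_installed())
```

The comparison deliberately ignores pre-release suffixes, so a build such as 4.34.0.dev0 is treated like 4.34.0. If the check reports an older version, upgrading transformers is the workaround reported in this thread.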