mymusise / ChatGLM-Tuning

A fine-tuning solution based on ChatGLM-6B + LoRA
MIT License

examples/infer_pretrain.ipynb raises an error when run #229

Open · 450586509 opened this issue 1 year ago

450586509 commented 1 year ago

in <cell line: 3>:3

/usr/local/lib/python3.10/dist-packages/peft/peft_model.py:181 in from_pretrained
  ❱ 181     model.load_adapter(model_id, adapter_name, **kwargs)

/usr/local/lib/python3.10/dist-packages/peft/peft_model.py:384 in load_adapter
          # load the weights into the model
  ❱ 384     set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)

/usr/local/lib/python3.10/dist-packages/peft/utils/save_and_load.py:123 in set_peft_model_state_dict
  ❱ 123     model.load_state_dict(peft_model_state_dict, strict=False)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:2041 in load_state_dict
  ❱ 2041     raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    2042         self.__class__.__name__, "\n\t".join(error_msgs)))

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
    size mismatch for base_model.model.transformer.layers.0.attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
    size mismatch for base_model.model.transformer.layers.0.attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
    size mismatch for base_model.model.transformer.layers.1.attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
    size mismatch for base_model.model.transformer.layers.1.attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([8192, 8, 1]) from checkpoint, the shape in current model is torch.Size([12288, 8]).
    (the same pair of size mismatches repeats for layers 2, 3, 4, ...)
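The shapes in the error suggest the adapter on disk and the model being rebuilt at inference time do not use the same LoRA configuration: the checkpoint's lora_A has rank 16 while the freshly built model expects rank 8, and the checkpoint's lora_B is 3-dimensional ([8192, 8, 1], a layout produced by older peft releases for the merged query_key_value projection), whereas the currently installed peft builds a 2-dimensional [12288, 8] weight. Below is a minimal diagnostic sketch, not part of the notebook, assuming the adapter was saved to a directory here called `output/` (a placeholder); it only compares what adapter_config.json declares with what adapter_model.bin actually contains, so the mismatch can be confirmed before loading:

```python
# Diagnostic sketch (hypothetical paths). "output/" is a placeholder for the
# directory that holds adapter_config.json and adapter_model.bin.
import torch
from peft import LoraConfig

peft_path = "output/"

# Config that PeftModel.from_pretrained will use to rebuild the LoRA layers.
config = LoraConfig.from_pretrained(peft_path)
print("config r:", config.r)
print("target modules:", config.target_modules)

# Shapes that are actually stored in the checkpoint.
weights = torch.load(f"{peft_path}/adapter_model.bin", map_location="cpu")
for name, tensor in weights.items():
    if "layers.0." in name:  # one layer is enough to see the pattern
        print(name, tuple(tensor.shape))

# If lora_A has 16 rows while config.r is 8, or lora_B is 3-D such as
# (8192, 8, 1), the adapter was trained with a different LoraConfig and/or
# peft version than the one installed now, and load_state_dict will fail
# with exactly the size mismatches shown above.
```

If the shapes disagree, the usual remedies are to load the adapter with the same peft version (and LoraConfig) that produced it, or to re-run the fine-tuning with the currently installed peft; forcing the load with strict=False would not fix the rank mismatch.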

Daniel-1997 commented 1 year ago

Has this been resolved?