yuanzhoulvpi2017 / zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)
MIT License

ChatGLM2 LoRA finetuning, loading LoRA parameters: RuntimeError: Expected 4-dimensional input for 4-dimensional weight [3072, 32, 1, 1], but got 3-dimensional input of size [1, 64, 4096] instead #150

Open yilong2001 opened 1 year ago

yilong2001 commented 1 year ago

Model structure after loading the LoRA weights:


PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): ChatGLMForConditionalGeneration(
      (transformer): ChatGLMModel(
        (embedding): Embedding(
          (word_embeddings): Embedding(65024, 4096)
        )
        (rotary_pos_emb): RotaryEmbedding()
        (encoder): GLMTransformer(
          (layers): ModuleList(
            (0-27): 28 x GLMBlock(
              (input_layernorm): RMSNorm()
              (self_attention): SelfAttention(
                (query_key_value): MergedLinear(
                  in_features=4096, out_features=4608, bias=True
                  (lora_dropout): Dropout(p=0.05, inplace=False)
                  (lora_A): Linear(in_features=4096, out_features=64, bias=False)
                  (lora_B): Conv1d(64, 3072, kernel_size=(1,), stride=(1,), groups=2, bias=False)
                )
                (core_attention): CoreAttention(
                  (attention_dropout): Dropout(p=0.0, inplace=False)
                )
                (dense): Linear(in_features=4096, out_features=4096, bias=False)
              )
              (post_attention_layernorm): RMSNorm()
              (mlp): MLP(
                (dense_h_to_4h): Linear(in_features=4096, out_features=27392, bias=False)
                (dense_4h_to_h): Linear(in_features=13696, out_features=4096, bias=False)
              )
            )
          )
          (final_layernorm): RMSNorm()
        )
        (output_layer): Linear(in_features=4096, out_features=65024, bias=False)
      )
    )
  )
)

Loading code:


from transformers import AutoTokenizer, AutoModel
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True)
model = PeftModel.from_pretrained(model, peft_model_id)

model = model.half().cuda()
model = model.eval()

The error occurs at the line below. Note that model.eval() calls train(False), which is what triggers peft's LoRA weight-merge path shown in the traceback:


│ /home/beeservice/.conda/envs/pt/lib/python3.10/site-packages/peft/tuners/lora.py:417 in train    │
│                                                                                                  │
│   414 │   │   if not mode and self.merge_weights and not self.merged:                            │
│   415 │   │   │   # Merge the weights and mark it                                                │
│   416 │   │   │   if self.r > 0 and any(self.enable_lora):                                       │
│ ❱ 417 │   │   │   │   delta_w = F.conv1d(                                                        │
│   418 │   │   │   │   │   self.lora_A.weight.data.unsqueeze(0),                                  │
│   419 │   │   │   │   │   self.lora_B.weight.data.unsqueeze(-1),                                 │
│   420 │   │   │   │   │   groups=sum(self.enable_lora),                                          │
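For context, the failure can be reproduced outside the model: lora_A is a Linear whose weight is [64, 4096], while lora_B is a Conv1d whose weight is already 3-dimensional ([3072, 32, 1]), so the extra unsqueeze(-1) in the merge path yields a 4-dimensional weight while the input stays 3-dimensional. A minimal sketch (the tensor shapes are copied from the traceback; the torch.randn placeholders stand in for the real weights):

import torch
import torch.nn.functional as F

# Placeholder tensors with the exact shapes from the traceback above.
lora_A = torch.randn(64, 4096)     # Linear(in=4096, out=64) weight
lora_B = torch.randn(3072, 32, 1)  # Conv1d(64, 3072, kernel_size=1, groups=2) weight: already 3-D

# Mirrors the merge in peft's lora.py:417. unsqueeze(-1) on an already
# 3-D Conv1d weight produces a 4-D weight while the input stays 3-D,
# and F.conv1d then raises exactly the RuntimeError from the title.
delta_w = F.conv1d(
    lora_A.unsqueeze(0),    # [1, 64, 4096]    -> 3-D input
    lora_B.unsqueeze(-1),   # [3072, 32, 1, 1] -> 4-D weight
    groups=2,
)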
yuanzhoulvpi2017 commented 1 year ago

This loading approach is wrong. Take a look at my file: https://github.com/yuanzhoulvpi2017/zero_nlp/blob/main/chatglm_v2_6b_lora/infer_lora.ipynb

yilong2001 commented 1 year ago

Loading it this way hits the same problem:

import torch  # needed for torch.bfloat16 below

model = AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True, device_map='auto', torch_dtype=torch.bfloat16)

model = PeftModel.from_pretrained(model, peft_model_id)
model = model.eval()
yilong2001 commented 1 year ago

If I load it like this instead (calling eval() first):

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True, device_map='auto', torch_dtype=torch.bfloat16)
model = model.eval()

model = PeftModel.from_pretrained(model, peft_model_id)

then this step raises the following error:

ValueError: We need an `offload_dir` to dispatch this model according to this `device_map`, the following submodules need to be offloaded: base_model.model.transformer.encoder.layers.1,
base_model.model.transformer.encoder.layers.2, base_model.model.transformer.encoder.layers.3, base_model.model.transformer.encoder.layers.4

Error location:
│ /home/beeservice/.conda/envs/pt/lib/python3.10/site-packages/peft/peft_model.py:177 in           │
│ from_pretrained                                                                                  │
│                                                                                                  │
│   174 │   │   │   │   device_map = infer_auto_device_map(                                        │
│   175 │   │   │   │   │   model, max_memory=max_memory, no_split_module_classes=no_split_modul   │
│   176 │   │   │   │   )                                                                          │
│ ❱ 177 │   │   │   model = dispatch_model(model, device_map=device_map)                           │
│   178 │   │   │   hook = AlignDevicesHook(io_same_device=True)                                   │
│   179 │   │   │   if model.peft_config.peft_type == PeftType.LORA:                               │
│   180 │   │   │   │   add_hook_to_module(model.base_model.model, hook)                           │
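For what it's worth, this ValueError means accelerate decided to offload some layers to CPU/disk under device_map='auto', and peft's dispatch step then demanded an offload_dir. A minimal sketch of one way around it, assuming the model actually fits on a single GPU, is to pin the whole model to one device so nothing needs offloading:

import torch
from transformers import AutoModel
from peft import PeftModel

# device_map={"": 0} places the root module (and thus everything) on
# cuda:0, so the dispatch step has no submodules to offload.
model = AutoModel.from_pretrained(
    model_name_or_path,
    trust_remote_code=True,
    device_map={"": 0},
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(model, peft_model_id)
model = model.eval()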
yuanzhoulvpi2017 commented 1 year ago

I'm not sure whether you trained with my code. It could also be a version problem with the transformers and peft packages; I suggest updating them and trying again.
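A quick sanity check before updating (a trivial sketch; the MergedLinear module in the printout above comes from older peft releases, so a training/inference version mismatch is plausible):

import peft
import transformers

# If these differ from the versions used at training time, the saved
# LoRA weights may not match the module layout peft builds at load time.
print("peft:", peft.__version__)
print("transformers:", transformers.__version__)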