tloen / alpaca-lora

Instruct-tune LLaMA on consumer hardware
Apache License 2.0

cannot export merged model #272

Open alexl83 opened 1 year ago

alexl83 commented 1 year ago

Hi! I'm trying to merge the LLaMA-13B model with a LoRA fine-tune I performed thanks to this repo, but I get a size-mismatch error. Can you please help?

Thank you!

command line: BASE_MODEL=/home/alex/oobabooga/text-generation-webui/models/llama-13b python export_state_dict_checkpoint.py alpacacleaned-13b-loratrained.bin

adapter_config.json:


{
  "base_model_name_or_path": "/home/alex/oobabooga/text-generation-webui/models/llama-13b",
  "bias": "none",
  "enable_lora": null,
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "lora_alpha": 64,
  "lora_dropout": 0.05,
  "merge_weights": false,
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 32,
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj"
  ],
  "task_type": "CAUSAL_LM"
}
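
For reference, the part of export_state_dict_checkpoint.py that fails appears to do roughly the following (a sketch reconstructed from the traceback below; the adapter repo id and exact arguments are illustrative, not taken from the script):

import torch
from peft import PeftModel
from transformers import LlamaForCausalLM

BASE_MODEL = "/home/alex/oobabooga/text-generation-webui/models/llama-13b"

# Loads the 41 base-model checkpoint shards seen in the progress bar below.
base_model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,
)

# This is the call that raises the size-mismatch error: the adapter weights
# must match the hidden size of the base model they were trained against.
lora_model = PeftModel.from_pretrained(base_model, "tloen/alpaca-lora-7b")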

error output:


===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /home/alex/oobabooga/installer_files/env/envs/alpaca-lora/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.9
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/alex/oobabooga/installer_files/env/envs/alpaca-lora/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Loading checkpoint shards: 100%|██████████| 41/41 [00:36<00:00,  1.13it/s]
Traceback (most recent call last):
  File "/home/alex/oobabooga/alpaca-lora/export_state_dict_checkpoint.py", line 23, in <module>
    lora_model = PeftModel.from_pretrained(
  File "/home/alex/oobabooga/installer_files/env/envs/alpaca-lora/lib/python3.10/site-packages/peft/peft_model.py", line 163, in from_pretrained
    model = set_peft_model_state_dict(model, adapters_weights)
  File "/home/alex/oobabooga/installer_files/env/envs/alpaca-lora/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 74, in set_peft_model_state_dict
    model.load_state_dict(peft_model_state_dict, strict=False)
  File "/home/alex/oobabooga/installer_files/env/envs/alpaca-lora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
        size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([16, 5120]).
        size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([5120, 16]).
        size mismatch for base_model.model.model.layers.0.self_attn.k_proj.lora_A.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([16, 5120]).
        size mismatch for base_model.model.model.layers.0.self_attn.k_proj.lora_B.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([5120, 16]).
        size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_A.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([16, 5120]).
        size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_B.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([5120, 16]).
        size mismatch for base_model.model.model.layers.0.self_attn.o_proj.lora_A.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([16, 5120]).
        size mismatch for base_model.model.model.layers.0.self_attn.o_proj.lora_B.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([5120, 16]).
        [... the same lora_A/lora_B size mismatches (q_proj, k_proj, v_proj, o_proj) repeat for layers 1 through 31 ...]
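
The shapes in the trace explain the failure: 4096 is the hidden size of LLaMA-7B, while 5120 is the hidden size of LLaMA-13B, so the saved adapter weights appear to come from a 7B run (their rank of 16 also disagrees with the r=32 in adapter_config.json above). A hypothetical way to inspect which base model a saved adapter expects (the path is illustrative):

import torch

# Load only the adapter state dict, not the base model.
state_dict = torch.load("lora-alpaca/adapter_model.bin", map_location="cpu")
for name, tensor in list(state_dict.items())[:4]:
    print(name, tuple(tensor.shape))
# For a lora_A weight, the first dim is the LoRA rank and the second is the
# base model's hidden size: 4096 means LLaMA-7B, 5120 means LLaMA-13B.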
tloen commented 1 year ago

See #211

alexl83 commented 1 year ago

Thanks! Is there a way to point the export script to a local copy of the trained LoRA? It's trying to fetch it from Hugging Face.

tloen commented 1 year ago

There should be a line in the script specifying the weight path; just point that to a local directory (e.g. './lora-alpaca').
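
Concretely, that means changing the second argument of the PeftModel.from_pretrained call; a sketch (the exact line differs between script versions, and the Hub id shown is illustrative):

# Before: the adapter is fetched from the Hugging Face Hub.
lora_model = PeftModel.from_pretrained(base_model, "tloen/alpaca-lora-7b")

# After: load the adapter from the local training output directory.
lora_model = PeftModel.from_pretrained(base_model, "./lora-alpaca")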

alexl83 commented 1 year ago

Processing now, thanks! One last doubt: given that I changed the target modules to train, is the export_hf script able to merge them?


  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj"
  ], 
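
For what it's worth, peft rebuilds the LoRA layers from the target_modules recorded in adapter_config.json, so all four projections should be folded in. A minimal merge-and-save sketch, assuming a peft release recent enough to provide merge_and_unload (paths are illustrative):

import torch
from peft import PeftModel
from transformers import LlamaForCausalLM

base_model = LlamaForCausalLM.from_pretrained(
    "/home/alex/oobabooga/text-generation-webui/models/llama-13b",
    torch_dtype=torch.float16,
)
lora_model = PeftModel.from_pretrained(base_model, "./lora-alpaca")

# Folds every LoRA target module (q/k/v/o_proj here) into the base weights.
merged_model = lora_model.merge_and_unload()
merged_model.save_pretrained("./llama-13b-alpaca-merged")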
amcl commented 1 year ago

I edited the script to point to the local lora_model and everything appears to load correctly.

I'm getting this Assertion Error:

Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 33/33 [00:09<00:00,  3.56it/s]
Traceback (most recent call last):
  File "/home/alpaca-lora/export_hf_checkpoint.py", line 46, in <module>
    assert not torch.allclose(first_weight_old, first_weight)

Off-topic: "deloreanized" was a nice touch.

xxxiaol commented 1 year ago

> I edited the script to point to the local lora_model and everything appears to load correctly.
>
> I'm getting this Assertion Error:
> assert not torch.allclose(first_weight_old, first_weight)

I also encountered this error. Have you solved it?

bupticybee commented 1 year ago

Same error here, @xxxiaol @amcl did you guys solve it?

xxxiaol commented 1 year ago

> Same error here, @xxxiaol @amcl did you guys solve it?

I just removed the assertion line. Not sure if it works fine.

bupticybee commented 1 year ago

> Same error here, @xxxiaol @amcl did you guys solve it?
>
> I just removed the assertion line. Not sure if it works fine.

I don't think that's the right way to solve this. The assertion is there to ensure the LoRA weights were actually applied.
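
For context, the guard at line 46 of export_hf_checkpoint.py appears to work like the toy below: the script clones one base weight before merging, then asserts that merging changed it (variable names follow the traceback; the LoRA shapes mirror this thread):

import torch
import torch.nn as nn

layer = nn.Linear(4096, 4096, bias=False)
first_weight = layer.weight
first_weight_old = first_weight.data.clone()

# Stand-in for the merge: W <- W + B @ A with a rank-16 LoRA delta.
lora_A = 0.01 * torch.randn(16, 4096)
lora_B = 0.01 * torch.randn(4096, 16)
with torch.no_grad():
    first_weight += lora_B @ lora_A

# Fails only when the delta is (numerically) zero, i.e. the adapter was
# saved empty or never applied -- which is what's being reported here.
assert not torch.allclose(first_weight_old, first_weight)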

xxxiaol commented 1 year ago

@bupticybee For me, this error occurred because the LoRA weights were not saved successfully. I reinstalled peft at the version mentioned in #293, reran the finetune code, and the assertion error was resolved. I hope this helps.

Mihaiii commented 1 year ago

> I reinstalled peft at the version mentioned in https://github.com/tloen/alpaca-lora/issues/293, reran the finetune code, and the assertion error was resolved.

I still have this problem even after using peft @ e536616888d51b453ed354a6f1e243fecb02ea08 and redoing the training. Any other hints?

agentfunk commented 1 year ago

Can you compare the size of adapter_model.bin against pytorch_model.bin in the checkpoint folder? There seems to be an issue with the peft library when saving the final model; see #390.
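
A quick way to make that comparison (paths are hypothetical; adjust them to your output directory). A healthy adapter for this config should be tens of megabytes, while the buggy peft save reportedly produces an adapter_model.bin of only a few hundred bytes:

import os

for path in (
    "lora-alpaca/adapter_model.bin",
    "lora-alpaca/checkpoint-200/pytorch_model.bin",
):
    if os.path.exists(path):
        # A near-empty adapter_model.bin means peft saved no LoRA weights,
        # and the merge assertion above will then fail.
        print(path, os.path.getsize(path), "bytes")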