huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft
Apache License 2.0

Why is the original layer weight saved in the LoRA adapter? #2092

Closed: leosongwei closed this issue 1 month ago

leosongwei commented 1 month ago

System Info

peft==0.12.0
transformers==4.44.2
Python 3.11.2
OS: Debian GNU/Linux 12 (bookworm)

Who can help?

No response

Reproduction

Note that Qwen2.5-3B has "tie_word_embeddings": true in its config; I guess that is what causes the trouble?
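
(For reference, the tie can be checked from the config alone; a minimal sketch, assuming the same model id as below:)

from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
print(config.tie_word_embeddings)  # True: lm_head shares its weight with embed_tokens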

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
import safetensors.torch
import torch

model_dir = "Qwen/Qwen2.5-3B"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True, padding_side='left')
model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.bfloat16, device_map="cuda", trust_remote_code=True, use_safetensors=True).eval()

targets = ["q_proj", "k_proj", "v_proj", "o_proj", "lm_head"]
lora_conf = LoraConfig(
    r=4,
    target_modules=targets,
    lora_dropout=0.1,
    bias='none'
)
lora_model = get_peft_model(model, lora_conf)
lora_model.save_pretrained("lora_save_test")

lora_contents = safetensors.torch.load_file("lora_save_test/adapter_model.safetensors")

for k, p in lora_contents.items():
    print(k)
    print(f"    {p.shape}")

Result:

base_model.model.lm_head.base_layer.weight
    torch.Size([151936, 2048])
...

Clearly, the base model's (tied) embedding weight is saved in the LoRA adapter.
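
(To put a number on it, a quick check on the file loaded above shows this is the full lm_head matrix, not a low-rank delta:)

emb = lora_contents["base_model.model.lm_head.base_layer.weight"]
print(emb.numel() * emb.element_size())  # 151936 * 2048 * 2 bytes (bfloat16) ≈ 622 MB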

Expected behavior

No base model parameter should be saved within the LoRA adapter's safetensors file.

leosongwei commented 1 month ago

Aha, that comes from save_embedding_layers:

import peft
peft.utils.other.EMBEDDING_LAYER_NAMES # ['embed_tokens', 'lm_head']
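
(My understanding, inferred from this constant rather than a definitive description: save_embedding_layers defaults to "auto", and "auto" saves the full embedding weights as soon as one of these names appears in target_modules. A quick check, reusing `targets` from the reproduction above:)

# "lm_head" is in targets, so the "auto" setting decides to save the (tied) embedding weights.
print(any(name in peft.utils.other.EMBEDDING_LAYER_NAMES for name in targets))  # True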

If it is set to False explicitly:

lora_model.save_pretrained("/dev/shm/lora_save_test", save_embedding_layers=False)

Then the resulting adapter file is small.
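
(As a sanity check, a sketch reusing the imports and path from above; reloading the second file shows only the LoRA tensors:)

small = safetensors.torch.load_file("/dev/shm/lora_save_test/adapter_model.safetensors")
for k, p in small.items():
    print(k, tuple(p.shape))  # only lora_A / lora_B weights, no base_layer.weight entries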