huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft
Apache License 2.0

Unobvious behavior of model after applying merge_and_unload #2032

Closed MezentsevIlya closed 3 months ago

MezentsevIlya commented 3 months ago

System Info

peft==0.12.0
transformers==4.44.0
accelerate==0.33.0
Python 3.9.0

Who can help?

No response

Information

Tasks

Reproduction

I trained mistralai/Mistral-7B-Instruct-v0.1 with a LoRA adapter. The model was loaded in 4-bit (using unsloth). Afterwards I wanted to merge the adapter into the model weights and save the result as a standalone model. To do this, I loaded the model, loaded the adapter, and called merge_and_unload. But it took quite a bit of trial and error to get the desired result.

from transformers import AutoModelForCausalLM
from peft import PeftModel

model = AutoModelForCausalLM.from_pretrained(model_name)
model = PeftModel.from_pretrained(model, adapter_path)
merged_model = model.merge_and_unload()
merged_model.save_pretrained(results_path)

It turned out that:

  1. It is important to load the model without specifying quantization (this is logical, because the model weights will be quantized, but the adapter weights are not)
  2. It is important to assign the result of merge_and_unload to a variable. After merging, the model object still has the type peft.peft_model.PeftModelForCausalLM, so saving it only saves the adapter; but the adapter weights are no longer there, since they have been merged, so an essentially empty adapter.safetensors is written. Because of this, when trying to load the merged model with from_pretrained, we get the base model (since its name/path is specified in adapter_config) without the adapter. The returned merged_model, on the other hand, has the type transformers.models.mistral.modeling_mistral.MistralForCausalLM and is saved correctly; it can be loaded with from_pretrained and used (see the sketch after this list).
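
To make the second point concrete, building on the snippet above (the comments show what I observed and are illustrative):

# The original variable keeps its PEFT type; only the returned object is a plain transformers model.
print(type(model))         # peft.peft_model.PeftModelForCausalLM
print(type(merged_model))  # transformers.models.mistral.modeling_mistral.MistralForCausalLM

model.save_pretrained(results_path)         # writes an (essentially empty) adapter checkpoint
merged_model.save_pretrained(results_path)  # writes the full standalone model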

The second point worries me. In my opinion, this is not an obvious point. I managed to figure it out only after 1.5 days, having conducted several experiments.

Expected behavior

I am not sure, but in my opinion it would be clearer if, after applying merge_and_unload to the model object, that same object became saveable as a standalone model.

BenjaminBossan commented 3 months ago
  1. It is important to load the model without specifying quantization (this is logical, because the model weights will be quantized, but the adapter weights are not)

This depends. With bitsandbytes, we support merging into the quantized weights. Or do I misunderstand what you mean?
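
Roughly, something like this should work (just a sketch; the model and adapter paths are placeholders, and note that merging into quantized weights involves a dequantize/requantize round-trip, so small rounding differences are possible):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load the base model quantized to 4-bit with bitsandbytes.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)

# Attach the LoRA adapter and merge it into the quantized weights.
model = PeftModel.from_pretrained(model, adapter_path)
merged_model = model.merge_and_unload()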

2. It is important to assign the result of merge_and_unload to a variable.

it would be clearer if, after applying merge_and_unload to the model object, that same object became saveable as a standalone model.

Unfortunately, this is not possible. What you're effectively asking is that after calling merge_and_unload, the model variable is transformed into a normal, non-PEFT, transformers model. In abstract Python code, that would be:

class Foo:
    ...

class Bar:
    ...

foo = Foo()
foo.make_bar()
assert type(foo) == Bar

There is no way to change the type of an assigned variable through a method call (except for extreme hacks).

I searched the PEFT code base and, unless I missed something, all examples and docs show that the result of merge_and_unload() should be assigned to a new variable. Is there some place where you learned about this method that is missing this detail? Please let me know so that I can improve the docs. But I'm afraid that, apart from the docs, there is not much more that can be done.
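
In other words, the documented pattern is to capture the return value; rebinding the same variable name also works (a sketch with placeholder names):

model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()   # `model` is now a plain transformers model
model.save_pretrained(results_path)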

MezentsevIlya commented 3 months ago

This depends. With bitsandbytes, we support merging into the quantized weights.

Sure, you are right. I just didn't know about it.

I searched the PEFT code base and, unless I missed something, all examples and docs show that the result of merge_and_unload() should be assigned to a new variable. Is there some place where you learned about this method that is missing this detail? Please let me know so that I can improve the docs. But I'm afraid that, apart from the docs, there is not much more that can be done.

Hm... yes, now that I understand how it works and have searched the examples and docs again, I do see it. I saw it before, but probably I wasn't attentive enough to notice this detail.

Thank you for your response

BenjaminBossan commented 3 months ago

All right. I'll close the issue then, but feel free to re-open if something new comes up or if you have an idea of how we can improve the docs. The experience you had is certainly very frustrating, so if we can help avoid it in the future, we'll do our best.