unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Gemma 2's 2B LoRA adapter merge is not working #869

Open LostRuins opened 2 months ago

LostRuins commented 2 months ago

1) Load the Gemma2 2B model with Unsloth - OK (a rough sketch of the full flow is below)
2) Perform fine tuning - OK
3) Test the resulting model - OK, responses indicate the fine tuning was successful
4) Save in 16 bit via `model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit",)`
5) The model is saved, but it is identical to the base model; the adapter is not applied at all.
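For reference, the flow looked roughly like the following. This is a minimal sketch, not my exact notebook: the dataset and trainer setup are omitted and the LoRA hyperparameters are placeholders, but the load and save calls are the standard Unsloth ones.

```python
from unsloth import FastLanguageModel

# 1) Load Gemma2 2B through Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-2b",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach a LoRA adapter (placeholder hyperparameters)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

# 2) Fine tune with a trainer (omitted), then 3) test generations -
#    the responses clearly reflected the fine tuning

# 4) Save a merged 16 bit checkpoint - this is the step whose output
#    came back identical to the base model
model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit")
```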

If only the LoRA is saved separately via `model.save_pretrained("lora_model")`, it is saved correctly, and loading it with `AutoModelForCausalLM.from_pretrained` works as expected. However, attempting to merge the LoRA adapter into the base model, e.g. with `model.merge_and_unload()`, also fails: the resulting "merged" model is identical to the base model, with the adapter not applied (sketched below).
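To spell that out, this is my reading of the two paths (a sketch; `model` is the Unsloth-loaded PEFT model from the steps above):

```python
# Adapter-only save from the Unsloth model object: this worked, and the
# saved "lora_model" folder loads fine elsewhere
model.save_pretrained("lora_model")

# Merging on that same model object: this "succeeds" silently, but the
# result is indistinguishable from the base weights
merged = model.merge_and_unload()
```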

Adapters trained directly with Hugging Face work fine and can be merged into the base model without issue.

danielhanchen commented 2 months ago

Ok that's weird, it's not merging correctly? I'll check

LostRuins commented 2 months ago

Thanks! Are you able to repro it?

LostRuins commented 2 months ago

Okay, just a quick update: I deleted everything, downloaded a fresh copy of https://huggingface.co/unsloth/gemma-2-2b, then downloaded the Unsloth LoRA that I had made separately, and performed the merge with:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model in fp16, without 4-bit quantization
base_model = AutoModelForCausalLM.from_pretrained(
    modelpath,
    return_dict=True,
    torch_dtype=torch.float16,
    load_in_4bit=False,
)

# Apply the LoRA adapter, then merge it into the base weights
model = PeftModel.from_pretrained(base_model, lorapath)
model = model.merge_and_unload()
```

The merged model then seemed to work correctly, so I don't know whether it was user error somewhere in my original process.
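(For completeness, persisting the merged result is just the stock Transformers save; `outpath` below is a placeholder name:)

```python
from transformers import AutoTokenizer

# Save the merged checkpoint and its tokenizer (outpath is a placeholder)
model.save_pretrained(outpath)
AutoTokenizer.from_pretrained(modelpath).save_pretrained(outpath)
```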

Nonetheless, you might still want to investigate, since the original `model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit",)` call run directly from Colab did not work. But this approach is sufficient for me, so feel free to close this issue as resolved if you wish.
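In case it helps with debugging, here is roughly how I would check whether a merge actually touched the weights — a sketch that diffs a supposedly merged checkpoint against the base model (the merged path is a placeholder, and it assumes both fp16 models fit in memory):

```python
import torch
from transformers import AutoModelForCausalLM

# The original base model and a supposedly merged copy (placeholder path)
base = AutoModelForCausalLM.from_pretrained("unsloth/gemma-2-2b", torch_dtype=torch.float16)
merged = AutoModelForCausalLM.from_pretrained("path/to/merged", torch_dtype=torch.float16)

# Collect tensors that differ; a correct LoRA merge should change at
# least the adapter's target modules (e.g. the attention projections)
changed = [
    name
    for (name, p_base), (_, p_merged) in zip(
        base.state_dict().items(), merged.state_dict().items()
    )
    if not torch.equal(p_base, p_merged)
]
print(f"{len(changed)} tensors differ from the base model")
```

If this prints 0, the "merge" was a no-op, which is exactly what I was seeing from `save_pretrained_merged`.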