unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

How to train only last few layers using FastLanguageModel #1320

Open gneeraj97 opened 3 days ago

gneeraj97 commented 3 days ago

I am trying to fine-tune a model using FastLanguageModel, and I only want to train the last few layers. When I pass a target module/layer outside the allowed set, it throws an error saying that only the "accepted_modules" are allowed. The accepted modules are: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"].

Is there any way I can train only the last few layers using unsloth?
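For context, a minimal sketch of the kind of setup in question (the model name and hyperparameters are placeholders; the target_modules list mirrors the accepted modules above):

```python
from unsloth import FastLanguageModel

# Placeholder model and settings -- adjust to the actual setup.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Standard call: LoRA adapters are attached to every decoder layer,
# and only the modules below are accepted as target_modules.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
)
```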

danielhanchen commented 3 days ago

Yes, I think you can use layers_to_transform: https://huggingface.co/docs/peft/main/en/conceptual_guides/lora#common-lora-parameters-in-peft
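For example, something along these lines (a sketch, assuming get_peft_model forwards layers_to_transform to PEFT's LoraConfig as in the linked docs, and a 32-layer model; adjust the indices to your model):

```python
# Sketch: restrict LoRA to the last 4 of 32 decoder layers.
# Assumes layers_to_transform is passed through to PEFT's LoraConfig.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    layers_to_transform = [28, 29, 30, 31],  # last 4 layers of a 32-layer model
)
```

If your unsloth version does not accept that keyword, the same option can be set on a peft.LoraConfig directly and applied with peft's own get_peft_model.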

gneeraj97 commented 2 days ago

When I do that, the unsloth library emits these warnings:

Not an error, but Unsloth cannot patch MLP layers with our manual autograd engine since either LoRA adapters are not enabled or a bias term (like in Qwen) is used.
Not an error, but Unsloth cannot patch Attention layers with our manual autograd engine since either LoRA adapters are not enabled or a bias term (like in Qwen) is used.
Not an error, but Unsloth cannot patch O projection layer with our manual autograd engine since either LoRA adapters are not enabled or a bias term (like in Qwen) is used.
Unsloth 2024.7 patched 32 layers with 4 QKV layers, 4 O layers and 4 MLP layers.

Is it okay to go ahead and fine-tune the model? What kind of effect would these warnings have on training speed, model performance, etc.?
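A generic PyTorch check (not an unsloth utility) to see which layers actually received trainable LoRA weights before starting a run:

```python
# List trainable tensors so you can confirm that only the selected
# layers (e.g. 28-31 in the sketch above) carry LoRA weights.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable tensors")
for name in trainable:
    print(name)  # e.g. ...layers.31.self_attn.q_proj.lora_A.default.weight
```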