AGI-Edgerunners / LLM-Adapters

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
https://arxiv.org/abs/2304.01933
Apache License 2.0

FT with bottleneck: cannot perform fine-tuning on purely quantized models #57

Open Lao-yy opened 4 months ago

Lao-yy commented 4 months ago

Hi! I tried to fine-tune llama-2-13b with the bottleneck adapter, but I got a ValueError saying that a model loaded with load_in_8bit cannot be fine-tuned. What is the problem? How can I solve it?

ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details
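A hedged reading of the check behind this error (the exact internals vary across transformers versions): Trainer refuses to train a model whose base weights are quantized unless it recognises a trainable adapter from the pip-installed peft package, and the bottleneck adapters from this repo's vendored peft fork are not recognised. A small diagnostic sketch; `model` is assumed to be the PeftModelForCausalLM printed further below:

```python
# Hedged diagnostic: approximates the condition that makes Trainer raise
# "You cannot perform fine-tuning on purely quantized models".
def explain_trainer_refusal(model):
    quantized = getattr(model, "is_quantized", False)  # set by transformers for bitsandbytes-loaded models
    try:
        from peft import PeftModel                     # mainline, pip-installed peft
        recognised = isinstance(model, PeftModel)      # False for the vendored fork's PeftModelForCausalLM
    except ImportError:
        recognised = False
    print(f"quantized base: {quantized}, recognised mainline PeftModel: {recognised}")
    # quantized=True together with recognised=False is the combination that
    # trips the guard, even though bottleneck adapters are attached.
```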

The package versions I'm using are as follows:

accelerate 0.27.2
bitsandbytes 0.41.2.post2
black 23.11.0
transformers 4.39.0.dev0
torch 2.1.1
gradio 4.7.1

The PeftModel was constructed as follows; I think it was loaded in 8-bit correctly.

---------model structure---------
PeftModelForCausalLM(
  (base_model): BottleneckModel(
    (model): LlamaForCausalLM(
      (model): LlamaModel(
        (embed_tokens): Embedding(32000, 5120)
        (layers): ModuleList(
          (0-39): 40 x LlamaDecoderLayer(
            (self_attn): LlamaSdpaAttention(
              (q_proj): Linear8bitLt(in_features=5120, out_features=5120, bias=False)
              (k_proj): Linear8bitLt(in_features=5120, out_features=5120, bias=False)
              (v_proj): Linear8bitLt(in_features=5120, out_features=5120, bias=False)
              (o_proj): Linear8bitLt(in_features=5120, out_features=5120, bias=False)
              (rotary_emb): LlamaRotaryEmbedding()
            )
            (mlp): LlamaMLP(
              (gate_proj): Linear8bitLt(
                in_features=5120, out_features=5120, bias=False
                (adapter_down): Linear(in_features=5120, out_features=256, bias=False)
                (adapter_up): Linear(in_features=256, out_features=5120, bias=False)
                (act_fn): Tanh()
              )
              (up_proj): Linear8bitLt(
                in_features=5120, out_features=5120, bias=False
                (adapter_down): Linear(in_features=5120, out_features=256, bias=False)
                (adapter_up): Linear(in_features=256, out_features=5120, bias=False)
                (act_fn): Tanh()
              )
              (down_proj): Linear8bitLt(
                in_features=5120, out_features=5120, bias=False
                (adapter_down): Linear(in_features=5120, out_features=256, bias=False)
                (adapter_up): Linear(in_features=256, out_features=5120, bias=False)
                (act_fn): Tanh()
              )
              (act_fn): SiLU()
            )
            (input_layernorm): LlamaRMSNorm()
            (post_attention_layernorm): LlamaRMSNorm()
          )
        )
        (norm): LlamaRMSNorm()
      )
      (lm_head): CastOutputToFloat(
        (0): Linear(in_features=5120, out_features=32000, bias=False)
      )
    )
  )
)
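For completeness, a sketch of how a structure like the one above is built from this repo's finetune.py. BottleneckConfig comes from the vendored peft fork, and its argument names are reconstructed here as assumptions (the 256-dim Tanh adapters on gate_proj/up_proj/down_proj match the dump):

```python
# Hedged reconstruction (not a verbatim copy of finetune.py): an 8-bit
# llama-2-13b base gets bottleneck adapters attached via the peft fork
# shipped under LLM-Adapters/peft/src. Argument names are assumptions.
import torch
from transformers import LlamaForCausalLM, BitsAndBytesConfig
from peft import BottleneckConfig, get_peft_model  # vendored fork, not pip-installed peft

base = LlamaForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",                        # illustrative checkpoint name
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
    device_map="auto",
)

config = BottleneckConfig(
    bottleneck_size=256,                                # adapter_down: 5120 -> 256
    non_linearity="tanh",                               # matches (act_fn): Tanh() above
    target_modules=["gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)                    # yields the PeftModelForCausalLM printed above
```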

HZQ950419 commented 4 months ago

Hi,

Please refer to #55. Let us know if it helps solve your issue! Thanks!
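For comparison (independently of whatever #55 prescribes), the route the error message itself and the linked HF docs describe is to attach an adapter that the pip-installed peft, and therefore transformers' Trainer, recognises. A minimal sketch of that pattern using a mainline-peft LoRA adapter instead of this repo's bottleneck adapter; checkpoint name and hyperparameters are illustrative:

```python
# Mainline-peft pattern that satisfies Trainer's quantization check:
# quantized base + trainable LoRA adapter from pip-installed peft.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",                        # illustrative checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)            # casts norms/embeddings for stable k-bit training

lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)                  # a PeftModel that Trainer accepts
model.print_trainable_parameters()
```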

Lao-yy commented 4 months ago

I tried installing your package and it worked. But using Trainer to load_best_model at the end failed: it reported that the PEFT version (0.3.0) is not compatible with transformers (4.34.1). Then, after I updated PEFT to 0.6.0, I got another error:

ImportError: cannot import name 'inject_adapter_in_model' from 'peft' (/workingdir/peft/llama2-ft/LLM-Adapters/peft/src/peft/__init__.py)

Do you know anything about this?
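A note on the traceback: `inject_adapter_in_model` only appeared in mainline peft around 0.5, and recent transformers imports it from whichever `peft` wins on `sys.path`. The path in the error suggests the repo's vendored fork under `peft/src` is shadowing the pip-installed 0.6.0; this is an assumption about the cause, not a confirmed fix, and a quick check is to see which peft Python actually resolves:

```python
# Diagnostic: confirm which peft package Python imports and whether it
# exposes the symbol transformers is looking for.
import peft

print(peft.__version__)                          # the vendored fork reports 0.3.0; pip install was 0.6.0
print(peft.__file__)                             # shows whether peft/src (the fork) shadows site-packages
print(hasattr(peft, "inject_adapter_in_model"))  # False on the old fork, True on mainline peft >= 0.5
```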