bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

Cannot merge LORA layers when the model is loaded in 8-bit mode #430

Closed coolchenshan closed 1 year ago

coolchenshan commented 1 year ago

When running "guanaco_7B_demo_colab.ipynb" with load_in_4bit=True, I get a ValueError: "Cannot merge LORA layers when the model is loaded in 8-bit mode".

coolchenshan commented 1 year ago

```
ValueError                                Traceback (most recent call last)
Cell In[1], line 21
     14 m = AutoModelForCausalLM.from_pretrained(
     15     model_name,
     16     load_in_4bit=True,
     17     torch_dtype=torch.bfloat16,
     18     device_map={"": 0}
     19 )
     20 m = PeftModel.from_pretrained(m, adapters_name)
---> 21 m = m.merge_and_unload()
     22 tok = LlamaTokenizer.from_pretrained(model_name)
     23 tok.bos_token_id = 1

File ~/anaconda3/envs/LLM3/lib/python3.10/site-packages/peft/tuners/lora.py:352, in LoraModel.merge_and_unload(self)
    349     raise ValueError("GPT2 models are not supported for merging LORA layers")
    351 if getattr(self.model, "is_loaded_in_8bit", False) or getattr(self.model, "is_loaded_in_4bit", False):
--> 352     raise ValueError("Cannot merge LORA layers when the model is loaded in 8-bit mode")
    354 key_list = [key for key, _ in self.model.named_modules() if "lora" not in key]
    355 for key in key_list:

ValueError: Cannot merge LORA layers when the model is loaded in 8-bit mode
```

Qubitium commented 1 year ago

@coolchenshan The merge code is likely commented out in the notebook for a reason. Try loading the model with device_map on the GPU if you have enough VRAM, or load it onto the CPU, and leave 4-bit loading disabled when merging.

https://github.com/artidoro/qlora/blob/e3817441ccc7cff04ca13e8fab5196d453ff32f2/examples/guanaco_7B_demo_colab.ipynb#L4
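A minimal sketch of that approach (assumes you have enough memory to hold the full-precision weights; model_name and adapters_name are placeholders for the values used in the notebook):

```python
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizer
from peft import PeftModel

model_name = "huggyllama/llama-7b"        # placeholder: base model from the notebook
adapters_name = "timdettmers/guanaco-7b"  # placeholder: LoRA adapter from the notebook

# Load the base model WITHOUT load_in_4bit / load_in_8bit so the LoRA
# weights can be folded back into the base weights.
m = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},  # or {"": "cpu"} if GPU memory is tight
)

# Attach the adapter, then merge it and drop the PEFT wrappers.
m = PeftModel.from_pretrained(m, adapters_name)
m = m.merge_and_unload()

tok = LlamaTokenizer.from_pretrained(model_name)
tok.bos_token_id = 1
```

After merging you can save the result with m.save_pretrained(...) and, if you want, re-load that merged checkpoint in 4-bit for inference.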