
Issue w/ PeftModel.from_pretrained #7

Closed: bkj closed this issue 4 months ago

bkj commented 9 months ago

When I run the tutorial here (https://github.com/brevdev/notebooks/blob/main/mistral-finetune.ipynb), everything works until this line:

ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-950")

which gives me:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 332, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/home/paperspace/.local/lib/python3.11/site-packages/peft/peft_model.py", line 632, in load_adapter
    load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/paperspace/.local/lib/python3.11/site-packages/peft/utils/save_and_load.py", line 158, in set_peft_model_state_dict
    load_result = model.load_state_dict(peft_model_state_dict, strict=False)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2027, in load_state_dict
    load(self, state_dict)
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2015, in load
    load(child, child_state_dict, child_prefix)
  [Previous line repeated 5 more times]
  File "/home/paperspace/.anaconda/envs/mft_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2009, in load
    module._load_from_state_dict(
  File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 256, in _load_from_state_dict
    self.weight, state_dict = bnb.nn.Params4bit.from_state_dict(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/paperspace/.local/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 158, in from_state_dict
    data = state_dict.pop(prefix.rstrip('.'))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight'

I'm running

transformers==4.34.0
torch==2.0.1
...

(not sure what other package versions are relevant, but happy to share)
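
For reference, here's a quick way to dump the versions of the other packages that seem likely to be relevant (the exact set to check is my guess based on the traceback):

import importlib.metadata as md

# peft and bitsandbytes appear in the traceback; accelerate is a common extra suspect
for pkg in ("transformers", "torch", "peft", "bitsandbytes", "accelerate"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")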

Anyone have any thoughts? Thanks!

bkj commented 9 months ago

Disabling quantization_config in AutoModelForCausalLM.from_pretrained seems to get the model to load, and the results look decent:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,  # Mistral, same as before
    # quantization_config=bnb_config,  # disabling this is what lets the load succeed
    device_map="auto",
    trust_remote_code=True,
)

But even if it loads this way, it may not be handling quantization correctly: the adapter was trained against a 4-bit quantized base model, and here it's being applied to an unquantized one.
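
For comparison, here's a rough sketch of what quantization-aware loading should look like once whatever version mismatch is at fault gets sorted out: reload the base model with the same BitsAndBytesConfig used during fine-tuning, then attach the adapter. The config values below are assumptions based on typical QLoRA settings, not necessarily the notebook's exact ones.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Assumed 4-bit config; this should match whatever bnb_config was used for training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,  # Mistral, same as before
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter to the quantized base model
ft_model = PeftModel.from_pretrained(base_model, "mistral-viggo-finetune/checkpoint-950")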