huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft
Apache License 2.0

Error while loading from PeftModel. #1095

Closed: pranavbhat12 closed this issue 9 months ago

pranavbhat12 commented 10 months ago

I am trying to load the fine-tuned adapter on the llama-2-7b model, but I am getting the following error. This was working fine until yesterday. I tried downgrading the PEFT version to 0.5.0 and transformers to 4.32.0.

Code:

```python
finetune_model = PeftModel.from_pretrained(
    model, "llama-finetuned/llama-7b-hf-4bit-3epochs/", device_map="auto"
)
```

Error:

KeyError                                  Traceback (most recent call last)
Cell In[3], line 1
----> 1 finetune_model=PeftModel.from_pretrained(model,"llama-finetuned/llama-7b-hf-4bit-3epochs/",device_map="auto")

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/peft/peft_model.py:278, in PeftModel.from_pretrained(cls, model, model_id, adapter_name, is_trainable, config, **kwargs)
    276 else:
    277     model = MODEL_TYPE_TO_PEFT_MODEL_MAPPING[config.task_type](model, config, adapter_name)
--> 278 model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
    279 return model

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/peft/peft_model.py:557, in PeftModel.load_adapter(self, model_id, adapter_name, is_trainable, **kwargs)
    554 adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
    556 # load the weights into the model
--> 557 load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
    558 if (
    559     (getattr(self, "hf_device_map", None) is not None)
    560     and (len(set(self.hf_device_map.values()).intersection({"cpu", "disk"})) > 0)
    561     and len(self.peft_config) == 1
    562 ):
    563     device_map = kwargs.get("device_map", "auto")

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/peft/utils/save_and_load.py:135, in set_peft_model_state_dict(model, peft_model_state_dict, adapter_name)
    132 else:
    133     raise NotImplementedError
--> 135 load_result = model.load_state_dict(peft_model_state_dict, strict=False)
    136 if config.is_prompt_learning:
    137     model.prompt_encoder[adapter_name].embedding.load_state_dict(
    138         {"weight": peft_model_state_dict["prompt_embeddings"]}, strict=True
    139     )

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/nn/modules/module.py:2027, in Module.load_state_dict(self, state_dict, strict)
   2020         out = hook(module, incompatible_keys)
   2021         assert out is None, (
   2022             "Hooks registered with ``register_load_state_dict_post_hook`` are not"
   2023             "expected to return new values, if incompatible_keys need to be modified,"
   2024             "it should be done inplace."
   2025         )
-> 2027 load(self, state_dict)
   2028 del load
   2030 if strict:

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/nn/modules/module.py:2015, in Module.load_state_dict.<locals>.load(module, local_state_dict, prefix)
   2013         child_prefix = prefix + name + '.'
   2014         child_state_dict = {k: v for k, v in local_state_dict.items() if k.startswith(child_prefix)}
-> 2015         load(child, child_state_dict, child_prefix)
   2017 # Note that the hook can modify missing_keys and unexpected_keys.
   2018 incompatible_keys = _IncompatibleKeys(missing_keys, unexpected_keys)

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/nn/modules/module.py:2015, in Module.load_state_dict.<locals>.load(module, local_state_dict, prefix)
   2013         child_prefix = prefix + name + '.'
   2014         child_state_dict = {k: v for k, v in local_state_dict.items() if k.startswith(child_prefix)}
-> 2015         load(child, child_state_dict, child_prefix)
   2017 # Note that the hook can modify missing_keys and unexpected_keys.
   2018 incompatible_keys = _IncompatibleKeys(missing_keys, unexpected_keys)

    [... skipping similar frames: Module.load_state_dict.<locals>.load at line 2015 (4 times)]

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/nn/modules/module.py:2015, in Module.load_state_dict.<locals>.load(module, local_state_dict, prefix)
   2013         child_prefix = prefix + name + '.'
   2014         child_state_dict = {k: v for k, v in local_state_dict.items() if k.startswith(child_prefix)}
-> 2015         load(child, child_state_dict, child_prefix)
   2017 # Note that the hook can modify missing_keys and unexpected_keys.
   2018 incompatible_keys = _IncompatibleKeys(missing_keys, unexpected_keys)

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/torch/nn/modules/module.py:2009, in Module.load_state_dict.<locals>.load(module, local_state_dict, prefix)
   2007 def load(module, local_state_dict, prefix=''):
   2008     local_metadata = {} if metadata is None else metadata.get(prefix[:-1], {})
-> 2009     module._load_from_state_dict(
   2010         local_state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
   2011     for name, child in module._modules.items():
   2012         if child is not None:

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/bitsandbytes/nn/modules.py:256, in Linear4bit._load_from_state_dict(self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs)
    253     bias_data = state_dict.pop(prefix + "bias", None)
    254     self.bias.data = bias_data.to(self.bias.data.device)
--> 256 self.weight, state_dict = bnb.nn.Params4bit.from_state_dict(
    257                 state_dict, prefix=prefix + "weight" + ".", requires_grad=False
    258             )
    259 unexpected_keys.extend(state_dict.keys())

File ~/anaconda3/envs/python3/lib/python3.10/site-packages/bitsandbytes/nn/modules.py:158, in Params4bit.from_state_dict(cls, state_dict, prefix, requires_grad)
    156 @classmethod
    157 def from_state_dict(cls, state_dict, prefix="", requires_grad=False):
--> 158     data = state_dict.pop(prefix.rstrip('.'))
    160     # extracting components for QuantState from state_dict
    161     qs_dict = {}

KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.weight'

Any help would be appreciated. Thank you.
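
For context, the loading setup presumably looks something like the sketch below; the base-model code is not shown in this issue, so the model name and quantization settings are assumptions:

```python
# Hedged reproduction sketch: only the PeftModel call is shown in the issue,
# so the 4-bit base-model setup below (model name, quant config) is assumed.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
# This is the call that raises the KeyError in the traceback above.
finetune_model = PeftModel.from_pretrained(
    model, "llama-finetuned/llama-7b-hf-4bit-3epochs/", device_map="auto"
)
```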

BenjaminBossan commented 10 months ago

I tried downgrading the PEFT version to 0.5.0 and transformers to 4.32.0.

I assume that those are the versions you used when training the PEFT adapter? Did this resolve the issue or was the error still the same after the downgrade? If it was still the same error, could you please also try downgrading the bitsandbytes version to the one you used for training?

Other than that, what would be helpful is if you could jump into a debugger and print the keys of the state_dict at the time that the error occurs.
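
For example, a minimal sketch of that inspection, assuming the adapter was saved locally as adapter_model.bin (newer PEFT versions may write adapter_model.safetensors instead):

```python
# Print the keys of the saved adapter state_dict to compare against the key
# the KeyError complains about. The file name below is an assumption.
import torch

adapter_weights = torch.load(
    "llama-finetuned/llama-7b-hf-4bit-3epochs/adapter_model.bin",
    map_location="cpu",
)
for key in sorted(adapter_weights):
    # Expect only adapter keys (e.g. ...q_proj.lora_A.weight), not base
    # weights like ...q_proj.weight.
    print(key)
```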

pranavbhat12 commented 10 months ago

Yep, these are the versions used when training the PEFT adapter. Okay, I am checking by downgrading the bitsandbytes version as well.

benjamin-marie commented 10 months ago

I have the exact same problem today, but using Mistral. I tried with an adapter trained with PEFT 0.5 and another trained with 0.6.

KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.weight'

benjamin-marie commented 10 months ago

Downgrading bitsandbytes to 0.41.1 solves the issue.
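
A quick way to confirm which version is actually active in the environment (a small sketch; the pin itself would be pip install bitsandbytes==0.41.1):

```python
# Check the installed bitsandbytes version; the workaround from this thread
# is to pin it to 0.41.1.
from importlib.metadata import version

print(version("bitsandbytes"))  # expected: 0.41.1 for the workaround
```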

pranavbhat12 commented 10 months ago

Yep, downgrading the bitsandbytes version worked for me. But just one concern: until yesterday it was working fine, and the latest bitsandbytes release was in July 2023, so how come this failed all of a sudden?

Thank you @benjamin-marie @BenjaminBossan for the help.

BenjaminBossan commented 10 months ago

Thanks both of you for investigating further.

But just one concern: until yesterday it was working fine, and the latest bitsandbytes release was in July 2023, so how come this failed all of a sudden?

We had the v0.6.0 PEFT release on Nov. 3rd, so theoretically it could be a change in PEFT that makes it not work with newer bnb versions. But if downgrading PEFT to v0.5.0 did not help, that seems unlikely. There was, however, a new bnb release (v0.41.2) just a few hours ago, so it's the most likely candidate.

A quick check in the bnb code base shows that there were indeed some changes to the loading behavior quite recently, e.g.:

https://github.com/TimDettmers/bitsandbytes/commit/76b40a5c9ae708db98e8b4a13249b2806601a387

This does not necessarily mean that bnb is "at fault"; it could still be the case that we're using it wrong in PEFT.

It's probably worth monitoring the bnb issues in the upcoming days to see whether users report errors loading models independently of PEFT. Other than that, someone could do a git bisect to identify the commit that caused the issue and try to understand exactly why. That's going to be quite some work, though.
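
For anyone attempting that, a minimal sketch of a bisect test script, run from the bnb checkout with something like git bisect run python check_load.py. The script name, model, and adapter paths are assumptions carried over from this issue, and since bitsandbytes ships compiled components, each bisect step may also require a rebuild:

```python
# check_load.py -- hypothetical helper for `git bisect run`: exit 0 when the
# adapter loads, 1 when the KeyError reproduces, so git can find the commit.
import sys

try:
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf", load_in_4bit=True, device_map="auto"
    )
    PeftModel.from_pretrained(model, "llama-finetuned/llama-7b-hf-4bit-3epochs/")
except KeyError:
    sys.exit(1)  # bad commit: the loading bug reproduces
sys.exit(0)  # good commit: the adapter loads cleanly
```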

younesbelkada commented 10 months ago

Yes, for now the solution is to use bitsandbytes==0.41.1. cc @TimDettmers @poedator @Titus-von-Koeller. I think the solution on the bnb side is to simply ignore the state_dict when it does not contain keys that match the keys of the module, since in PEFT we only save the adapter weights.
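
A rough sketch of that kind of guard (a hypothetical illustration of the idea, not the actual bitsandbytes code):

```python
# Hypothetical illustration only: guard the custom 4-bit deserialization so
# that adapter-only state_dicts do not raise KeyError.
import torch.nn as nn


class GuardedLinear4bit(nn.Linear):
    """Illustrative stand-in for bnb.nn.Linear4bit, not the real class."""

    def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,
                              missing_keys, unexpected_keys, error_msgs):
        if prefix + "weight" in state_dict:
            # ... the custom 4-bit deserialization would run here ...
            pass
        # A PEFT checkpoint carries only adapter weights, so a missing base
        # weight is expected: defer to the default loader, which records it
        # in missing_keys instead of raising.
        super()._load_from_state_dict(state_dict, prefix, local_metadata,
                                      strict, missing_keys, unexpected_keys,
                                      error_msgs)


layer = GuardedLinear4bit(4, 4)
result = layer.load_state_dict({}, strict=False)
print(result.missing_keys)  # ['weight', 'bias'] -- no KeyError raised
```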

poedator commented 10 months ago

The problem comes from overriding _load_from_state_dict() in

https://github.com/TimDettmers/bitsandbytes/pull/753/commits/76b40a5c9ae708db98e8b4a13249b2806601a387

Sorry about that.

I am now testing a fix that reverts the problematic part and will open a PR today.

benjamin-marie commented 10 months ago

It seems like Tim already fixed bitsandbytes a few hours ago. It should work now.

poedator commented 10 months ago

It seems like Tim already fixed bitsandbytes a few hours ago. It should work now.

He fixed another problem then. For this one I just created https://github.com/TimDettmers/bitsandbytes/pull/864

Titus-von-Koeller commented 10 months ago

https://pypi.org/project/bitsandbytes/#history

Hi, he released some fix 3 hours ago, but that might have been the Python 3.8 typing fix by @younesbelkada.

I'm not at the computer right now and can't test whether that's already a fix of some sort. It doesn't seem like it, though.

I wrote to Tim to take a look before the end of his day; it's still morning in Seattle, so if @poedator has the fix ready soon, I'm pretty sure Tim can still release it today.

It would be good to add some automation to avoid such issues in the future, e.g. checking compliance with the lowest supported Python version and running the HF bnb tests as part of the bnb release process.

poedator commented 10 months ago

@Titus-von-Koeller - that was an urgent PR by Younes to fix another problem related to the Python version. Apparently, Python 3.8 needs from typing import Dict and cannot digest something like lowercase dict. Yet another reason to use 3.8 for testing.
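
For illustration: subscripting built-in generics in annotations only works on Python 3.9+, so code that must support 3.8 needs the typing module. (The function name here is hypothetical.)

```python
from typing import Dict

def unpack_qs(qs_dict: Dict[str, int]) -> None:  # OK on Python 3.8+
    ...

# def unpack_qs(qs_dict: dict[str, int]) -> None:
#     ...
# On Python 3.8 the definition above fails at import time:
# TypeError: 'type' object is not subscriptable
```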

Titus-von-Koeller commented 10 months ago

Fix was merged and released half an hour ago.

Thanks @poedator for the contribution and quick fix.

Thanks @TimDettmers @younesbelkada for the quick reaction.

BenjaminBossan commented 10 months ago

@pranavbhat12 @benjamin-marie Can you confirm that the latest bnb version solves your initial problem?

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.