flowstack-ai / genre-classifier-gpt2

GPT-GC (Genre Classifier) is a GPT-2-based model fine-tuned to classify movie genres.

Can't load state_dict for GPT2ForSequenceClassification (Unexpected key(s) in state_dict) #1

Open xiyiyia opened 1 year ago

xiyiyia commented 1 year ago

Hi! @malik727 @Hunaid2000 I guess the problem is in the GPTGC.pt. May I get a new file of pre-trained model? Thanks a lot!

Traceback (most recent call last):
  File "/home/genre-detector-gpt2/main.py", line 13, in <module>
    model = GPTGC(device) # Loading model for inference.
  File "/home/genre-detector-gpt2/GPTGC.py", line 59, in __init__
    self.load_model()
  File "/home/genre-detector-gpt2/GPTGC.py", line 165, in load_model
    self.model.load_state_dict(torch.load(f"{params['MODEL_DIR']}/{params['NAME']}.pt"))
RuntimeError: Error(s) in loading state_dict for GPT2ForSequenceClassification:
    Unexpected key(s) in state_dict: "transformer.h.0.attn.bias", "transformer.h.0.attn.masked_bias", "transformer.h.1.attn.bias", "transformer.h.1.attn.masked_bias", "transformer.h.2.attn.bias", "transformer.h.2.attn.masked_bias", "transformer.h.3.attn.bias", "transformer.h.3.attn.masked_bias", "transformer.h.4.attn.bias", "transformer.h.4.attn.masked_bias", "transformer.h.5.attn.bias", "transformer.h.5.attn.masked_bias", "transformer.h.6.attn.bias", "transformer.h.6.attn.masked_bias", "transformer.h.7.attn.bias", "transformer.h.7.attn.masked_bias", "transformer.h.8.attn.bias", "transformer.h.8.attn.masked_bias", "transformer.h.9.attn.bias", "transformer.h.9.attn.masked_bias", "transformer.h.10.attn.bias", "transformer.h.10.attn.masked_bias", "transformer.h.11.attn.bias", "transformer.h.11.attn.masked_bias".
xiyiyia commented 1 year ago

To get the code running, I decided to drop the stale bias keys. If accuracy is not your primary concern, you can apply the following change:

Replace the load_state_dict call inside load_model() at line 165 with the following snippet:

# Load the checkpoint, then keep only the keys the current model
# actually expects, discarding the stale attention-bias buffers.
loaded_dict = torch.load(f"{params['MODEL_DIR']}/{params['NAME']}.pt")
model_dict = self.model.state_dict()
loaded_dict = {k: v for k, v in loaded_dict.items() if k in model_dict}
model_dict.update(loaded_dict)
self.model.load_state_dict(model_dict)

With this change the model loads, and you can proceed with your testing.
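An alternative with the same effect is `load_state_dict(..., strict=False)`, which skips unexpected keys instead of raising. A minimal sketch of the idea, using a plain `nn.Linear` as a hypothetical stand-in for the GPT-2 model so it runs without the GPTGC checkpoint:

```python
import torch
import torch.nn as nn

# Simulate the mismatch: the saved state_dict carries an extra buffer
# ("attn.bias") that the current model class no longer registers.
saved = nn.Linear(4, 2)
saved_dict = saved.state_dict()
saved_dict["attn.bias"] = torch.ones(4)  # the stale key

model = nn.Linear(4, 2)

# strict=False loads the matching keys and reports (rather than raises on)
# the unexpected ones, unlike the default strict=True.
result = model.load_state_dict(saved_dict, strict=False)
print(result.unexpected_keys)  # → ['attn.bias']
```

Note that `strict=False` also silently tolerates *missing* keys, so it is worth checking the returned `missing_keys`/`unexpected_keys` lists rather than ignoring them.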

malik727 commented 1 year ago

@xiyiyia Can you tell me the Python, PyTorch, and Hugging Face transformer versions you're using? I tested on the following configuration and it runs fine:

Python = 3.9.13
HuggingFace-Hub = 0.11.1
PyTorch = 1.13.1
Transformers = 4.27.4

In principle, we should've used the "save_pretrained()" and "from_pretrained()" functions provided by the transformers library, as they don't tie the checkpoint format to a specific version of the library.
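A minimal sketch of that version-tolerant save/load round trip. The tiny config values are made up so the example runs offline; the real project would of course start from the fine-tuned GPTGC weights:

```python
import tempfile
from transformers import GPT2Config, GPT2ForSequenceClassification

# Deliberately tiny, hypothetical config so no weights are downloaded.
config = GPT2Config(n_layer=2, n_head=2, n_embd=8, vocab_size=50, num_labels=4)
model = GPT2ForSequenceClassification(config)

with tempfile.TemporaryDirectory() as ckpt_dir:
    # save_pretrained() writes the config alongside the weights, and
    # from_pretrained() reconstructs the model from that directory,
    # instead of pickling a raw state_dict with torch.save().
    model.save_pretrained(ckpt_dir)
    reloaded = GPT2ForSequenceClassification.from_pretrained(ckpt_dir)

print(reloaded.num_labels)  # → 4
```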

To fix the unexpected keys issue we'll need to alter the key names of the stored model to match the key names expected by the GPT-2 architecture.
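In this particular case, the unexpected keys are the causal-mask buffers (`attn.bias`, `attn.masked_bias`) that older transformers versions serialized but newer ones no longer expect, so "altering" them amounts to dropping them before loading. A sketch, with a small hand-built dict standing in for the real checkpoint:

```python
import torch

def strip_stale_attention_buffers(state_dict):
    """Drop the 'attn.bias' / 'attn.masked_bias' buffers that older
    transformers versions saved but newer ones no longer register."""
    stale = (".attn.bias", ".attn.masked_bias")
    return {k: v for k, v in state_dict.items() if not k.endswith(stale)}

# Stand-in for torch.load(f"{params['MODEL_DIR']}/{params['NAME']}.pt"):
loaded = {
    "transformer.h.0.attn.bias": torch.ones(3),
    "transformer.h.0.attn.masked_bias": torch.tensor(-1e4),
    "score.weight": torch.zeros(4, 8),
}
cleaned = strip_stale_attention_buffers(loaded)
print(sorted(cleaned))  # → ['score.weight']
```

The cleaned dict can then be passed to `self.model.load_state_dict(cleaned)` as usual.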

xiyiyia commented 1 year ago

Here are the versions of the packages you mentioned:

Python = 3.8.13
HuggingFace-Hub = 0.15.1
PyTorch = 1.11.0
Transformers = 4.31.0

I will create a new environment for testing.

Thanks for your nice work.