Open xiyiyia opened 1 year ago
To run this code, I decided to remove the bias layer. If accuracy is not your primary concern, you can make the following change:
Replace the load_model() function at line 165 with the following code snippet:
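# Load the saved checkpoint and the current model's state dict.
loaded_dict = torch.load(f"{params['MODEL_DIR']}/{params['NAME']}.pt")
model_dict = self.model.state_dict()
# Keep only the checkpoint entries whose keys the current model expects.
loaded_dict = {k: v for k, v in loaded_dict.items() if k in model_dict}
# Overwrite the matching entries, then load the merged state dict.
model_dict.update(loaded_dict)
self.model.load_state_dict(model_dict)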
This modification will allow you to proceed with your desired testing.
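Since the filter above drops keys silently, it may help to list what is being discarded before loading. The following is only a minimal sketch: plain dictionaries stand in for the real state dicts, and the helper name partition_keys and the example key names are hypothetical, not taken from the repository.

```python
def partition_keys(loaded_dict, model_dict):
    """Split checkpoint keys into kept, unexpected, and missing groups."""
    # Entries the current model can accept as-is.
    kept = {k: v for k, v in loaded_dict.items() if k in model_dict}
    # Keys present in the checkpoint but unknown to the model (these get dropped).
    unexpected = sorted(k for k in loaded_dict if k not in model_dict)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model_dict if k not in loaded_dict)
    return kept, unexpected, missing


# Plain dicts stand in for state dicts here; the key names are made up.
checkpoint = {"h.0.attn.c_attn.weight": 1, "h.0.attn.bias": 2}
model_state = {"h.0.attn.c_attn.weight": 0, "h.0.mlp.c_fc.weight": 0}

kept, unexpected, missing = partition_keys(checkpoint, model_state)
print("kept:", sorted(kept))
print("unexpected (dropped):", unexpected)
print("missing (left at the model's current values):", missing)
```

With real PyTorch objects, the same call would take the dict returned by torch.load() and self.model.state_dict(). PyTorch's load_state_dict() also accepts strict=False and returns the missing and unexpected keys, which may be a simpler route to the same information.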
@xiyiyia Can you tell me the Python, PyTorch, and Hugging Face transformer versions you're using? I tested on the following configuration and it runs fine:
Python = 3.9.13
HuggingFace-Hub = 0.11.1
PyTorch = 1.13.1
Transformers = 4.27.4
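For anyone comparing environments, the versions can be collected without importing the packages themselves. This is a rough sketch using only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_versions is hypothetical, and the distribution names passed to it are assumed to be the pip names of the packages listed above.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            result[name] = None
    return result


print("Python =", platform.python_version())
# Assumed pip distribution names for the three libraries in question.
for name, ver in installed_versions(["huggingface-hub", "torch", "transformers"]).items():
    print(f"{name} = {ver or 'not installed'}")
```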
In principle, we should've used the "save_pretrained()" and "from_pretrained()" functions provided by the transformers library, as they don't bind the functionality to a specific version of the library.
To fix the unexpected keys issue we'll need to alter the key names of the stored model to match the key names expected by the GPT-2 architecture.
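The key renaming described here could look roughly like the sketch below. The thread does not say how the stored key names differ from the expected ones, so the "model." prefix, the example keys, and the helper name rename_keys are all hypothetical; a plain dictionary stands in for the stored state dict.

```python
def rename_keys(state_dict, old_prefix, new_prefix):
    """Return a copy of state_dict with old_prefix replaced by new_prefix on matching keys."""
    renamed = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            # Swap only the leading prefix; the rest of the key is unchanged.
            key = new_prefix + key[len(old_prefix):]
        renamed[key] = value
    return renamed


# Hypothetical stored keys carrying an extra "model." prefix.
stored = {"model.transformer.wte.weight": "W", "model.lm_head.weight": "H"}
fixed = rename_keys(stored, "model.", "")
print(sorted(fixed))
```

The renamed dictionary would then be passed to load_state_dict() in place of the original one. Keys that do not start with the old prefix are passed through untouched, so the helper is safe to apply to a checkpoint that is only partly mismatched.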
Here are the versions of the packages you mentioned:
Python = 3.8.13
HuggingFace-Hub = 0.15.1
PyTorch = 1.11.0
Transformers = 4.31.0
I will create a new environment for testing.
Thanks for your nice work.
Hi! @malik727 @Hunaid2000 I suspect the problem is in the GPTGC.pt file. Could you share a new copy of the pre-trained model? Thanks a lot!