ChangwenXu98 / TransPolymer

Implementation of "TransPolymer: a Transformer-based language model for polymer property predictions" in PyTorch
MIT License

RuntimeError: Error(s) in loading state_dict for DownstreamRegression: #16

Closed: veraLiuWL closed this issue 5 months ago

veraLiuWL commented 6 months ago

Hello, I have successfully fine-tuned the pre-trained model and also generated the attention visualization for the pre-trained model. However, when I run Attention_vis.py to generate the attention map for the fine-tuned model, the following error occurs. Could you please help me?

```
Traceback (most recent call last):
  File "/home/liwei/桌面/TransPolymer-master/Attention_vis.py", line 141, in <module>
    main(attention_config)
  File "/home/liwei/桌面/TransPolymer-master/Attention_vis.py", line 89, in main
    model.load_state_dict(checkpoint['model'])
  File "/home/liwei/anaconda3/envs/TransPolymer/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DownstreamRegression:
    size mismatch for PretrainedModel.embeddings.word_embeddings.weight: copying a param with shape torch.Size([50908, 768]) from checkpoint, the shape in current model is torch.Size([50265, 768]).
```
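For reference, the mismatch can be confirmed directly from the fine-tuned checkpoint. This is a minimal sketch that reuses the model path from the config below and the parameter key from the traceback, assuming the checkpoint is a dict with a `'model'` entry as the traceback suggests:

```python
import torch

# Minimal check of the shapes involved in the mismatch (the path and the
# state-dict key are taken from the config and traceback in this issue).
checkpoint = torch.load("ckpt/PE_I_best_model.pt", map_location="cpu")
saved = checkpoint["model"]["PretrainedModel.embeddings.word_embeddings.weight"]
print(saved.shape)  # torch.Size([50908, 768]): 643 extra rows vs. the default
                    # RoBERTa vocabulary of 50265, i.e. the supplementary
                    # tokens added during fine-tuning on PE_I.
```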

This is my modified config_attention.yaml, used to generate the attention map for the fine-tuned model:

```yaml
task: 'finetune'                           # the task to visualize the attention scores
smiles: 'CCO'                              # the SMILES used for visualization when task=='pretrain'
layer: 0                                   # the hidden layer for visualization when task=='pretrain'
index: 8                                   # the index of the sequence used for visualization when task=='finetune'
add_vocab_flag: False                      # whether to add supplementary vocab

file_path: 'data/PE_I.csv'                 # train file path
vocab_sup_file: 'data/vocab_sup_PE_I.csv'  # supplementary vocab file path
model_path: 'ckpt/PE_I_best_model.pt'      # finetuned model path
pretrain_path: 'ckpt/pretrain.pt'          # pretrained model path
save_path: 'figs/attention_vis.png'        # figure save path
blocksize: 7                               # max length of sequences after tokenization

figsize_x: 30                              # the size of figure in x
figsize_y: 18                              # the size of figure in y
fontsize: 20                               # fontsize
labelsize: 15                              # label size
rotation: 45                               # rotation of figure
```

ChangwenXu98 commented 5 months ago

You should either load a different pretrained model checkpoint or set add_vocab_flag to True. When fine-tuning the model on the PE_I dataset, special tokens were added to the tokenizer, and the size of the model's word_embeddings changed accordingly; that is what causes the size mismatch ([50908, 768] vs. [50265, 768]). Hope this helps.
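For readers hitting the same error, here is a minimal sketch of what has to happen before load_state_dict is called when supplementary vocab was used during fine-tuning: extend the tokenizer, resize the embeddings, then load the weights. The Hugging Face RobertaTokenizer/RobertaModel wiring and the assumed CSV layout below are illustrative assumptions, not a copy of Attention_vis.py:

```python
import pandas as pd
import torch
from transformers import RobertaModel, RobertaTokenizer

# Sketch (assumed wiring): extend the tokenizer with the supplementary
# vocab, resize the embeddings so the model has 50908 rows, and only
# then load the fine-tuned weights.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
sup_tokens = pd.read_csv("data/vocab_sup_PE_I.csv", header=None)[0].tolist()  # assumes one token per row
tokenizer.add_tokens(sup_tokens)

encoder = RobertaModel.from_pretrained("roberta-base")
encoder.resize_token_embeddings(len(tokenizer))  # 50265 -> 50908 per the traceback

checkpoint = torch.load("ckpt/PE_I_best_model.pt", map_location="cpu")
# The fine-tuned checkpoint stores the encoder under the "PretrainedModel."
# prefix of DownstreamRegression; strip it here to load just the encoder.
encoder_state = {
    k[len("PretrainedModel."):]: v
    for k, v in checkpoint["model"].items()
    if k.startswith("PretrainedModel.")
}
encoder.load_state_dict(encoder_state, strict=False)
```

In the script itself, setting add_vocab_flag: True in config_attention.yaml together with the matching vocab_sup_file should trigger the equivalent steps before the state dict is loaded.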