RuntimeError: Error(s) in loading state_dict for DeepPunctuation: Missing key(s) in state_dict: "bert_layer.embeddings.position_ids".

FurkanGozukara commented 2 years ago

I have tried with my current installation and here the error

C:\punctuation-restoration\src>python inference.py --pretrained-model=roberta-large --weight-path=roberta-large-en.pt --language=en --in-file=data/test_en.txt --out-file=data/test_en_out.txt C:\Python399\lib\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available. warnings.warn("No audio backend is available.") loading file vocab.json from cache at C:\Users\King/.cache\huggingface\hub\models--roberta-large\snapshots\5069d8a2a32a7df4c69ef9b56348be04152a2341\vocab.json loading file merges.txt from cache at C:\Users\King/.cache\huggingface\hub\models--roberta-large\snapshots\5069d8a2a32a7df4c69ef9b56348be04152a2341\merges.txt loading file added_tokens.json from cache at None loading file special_tokens_map.json from cache at None loading file tokenizer_config.json from cache at None loading configuration file config.json from cache at C:\Users\King/.cache\huggingface\hub\models--roberta-large\snapshots\5069d8a2a32a7df4c69ef9b56348be04152a2341\config.json Model config RobertaConfig { "_name_or_path": "roberta-large", "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.22.1", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50265 }

loading configuration file config.json from cache at C:\Users\King/.cache\huggingface\hub\models--roberta-large\snapshots\5069d8a2a32a7df4c69ef9b56348be04152a2341\config.json Model config RobertaConfig { "architectures": [ "RobertaForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "bos_token_id": 0, "classifier_dropout": null, "eos_token_id": 2, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-05, "max_position_embeddings": 514, "model_type": "roberta", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 1, "position_embedding_type": "absolute", "transformers_version": "4.22.1", "type_vocab_size": 1, "use_cache": true, "vocab_size": 50265 }

loading weights file pytorch_model.bin from cache at C:\Users\King/.cache\huggingface\hub\models--roberta-large\snapshots\5069d8a2a32a7df4c69ef9b56348be04152a2341\pytorch_model.bin Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']

This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). All the weights of RobertaModel were initialized from the model checkpoint at roberta-large. If your task is similar to the task the model of the checkpoint was trained on, you can already use RobertaModel for predictions without further training. Traceback (most recent call last): File "C:\punctuation-restoration\src\inference.py", line 105, in inference() File "C:\punctuation-restoration\src\inference.py", line 41, in inference deep_punctuation.load_state_dict(torch.load(model_save_path)) File "C:\Python399\lib\site-packages\torch\nn\modules\module.py", line 1604, in load_state_dict raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( RuntimeError: Error(s) in loading state_dict for DeepPunctuation: Missing key(s) in state_dict: "bert_layer.embeddings.position_ids".

C:\punctuation-restoration\src>

kikkia commented 1 year ago

I am hitting a similar issue. Did you ever find a workaround?

BarAvni commented 1 year ago

You need to use transformers v2.11.0

xashru / punctuation-restoration

RuntimeError: Error(s) in loading state_dict for DeepPunctuation: Missing key(s) in state_dict: "bert_layer.embeddings.position_ids". #14