sunyilgdx / SIFRank_zh

Keyphrase or Keyword Extraction: a Chinese keyphrase extraction method based on pre-trained language models (the Chinese-language code for the paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model")

Error when running test.py: size mismatch for word_emb_layer.embedding.weight #16

Closed JeremySun1224 closed 2 years ago

JeremySun1224 commented 2 years ago

Hello, I get the following error when running the test case test.py that you provided:

2021-08-15 18:28:10,586 INFO: char embedding size: 6169
2021-08-15 18:28:10,924 INFO: word embedding size: 71222
2021-08-15 18:28:16,333 INFO: Model(
  (token_embedder): ConvTokenEmbedder(
    (word_emb_layer): EmbeddingLayer(
      (embedding): Embedding(71222, 100, padding_idx=3)
    )
    (char_emb_layer): EmbeddingLayer(
      (embedding): Embedding(6169, 50, padding_idx=6166)
    )
    (convolutions): ModuleList(
      (0): Conv1d(50, 32, kernel_size=(1,), stride=(1,))
      (1): Conv1d(50, 32, kernel_size=(2,), stride=(1,))
      (2): Conv1d(50, 64, kernel_size=(3,), stride=(1,))
      (3): Conv1d(50, 128, kernel_size=(4,), stride=(1,))
      (4): Conv1d(50, 256, kernel_size=(5,), stride=(1,))
      (5): Conv1d(50, 512, kernel_size=(6,), stride=(1,))
      (6): Conv1d(50, 1024, kernel_size=(7,), stride=(1,))
    )
    (highways): Highway(
      (_layers): ModuleList(
        (0): Linear(in_features=2048, out_features=4096, bias=True)
        (1): Linear(in_features=2048, out_features=4096, bias=True)
      )
    )
    (projection): Linear(in_features=2148, out_features=512, bias=True)
  )
  (encoder): ElmobiLm(
    (forward_layer_0): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
    (backward_layer_0): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
    (forward_layer_1): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
    (backward_layer_1): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
  )
)
Traceback (most recent call last):
  File "/Users/xing.sun/PycharmProjects/SIFRank_zh/test/test.py", line 14, in <module>
    ELMO = word_emb_elmo.WordEmbeddings(model_file)
  File "/Users/xing.sun/PycharmProjects/SIFRank_zh/embeddings/word_emb_elmo.py", line 22, in __init__
    self.elmo = Embedder(model_path)
  File "/Users/xing.sun/opt/anaconda3/envs/bert4keras36/lib/python3.6/site-packages/elmoformanylangs/elmo.py", line 106, in __init__
    self.model, self.config = self.get_model()
  File "/Users/xing.sun/opt/anaconda3/envs/bert4keras36/lib/python3.6/site-packages/elmoformanylangs/elmo.py", line 182, in get_model
    model.load_model(self.model_dir)
  File "/Users/xing.sun/opt/anaconda3/envs/bert4keras36/lib/python3.6/site-packages/elmoformanylangs/frontend.py", line 207, in load_model
    map_location=lambda storage, loc: storage))
  File "/Users/xing.sun/opt/anaconda3/envs/bert4keras36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ConvTokenEmbedder:
    size mismatch for word_emb_layer.embedding.weight: copying a param with shape torch.Size([140384, 100]) from checkpoint, the shape in current model is torch.Size([71222, 100]).
    size mismatch for char_emb_layer.embedding.weight: copying a param with shape torch.Size([15889, 50]) from checkpoint, the shape in current model is torch.Size([6169, 50]).

My environment is configured exactly as given in your README.

Looking forward to your reply, thank you.

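The two size mismatches say that the checkpoint in the model directory was saved with a 140384-word / 15889-character vocabulary, while the word.dic and char.dic files being read yield only 71222 and 6169 entries, so the dictionaries and the weights appear to come from different model versions (or an incomplete download). A minimal diagnostic sketch, assuming the usual elmoformanylangs model layout (token_embedder.pkl plus word.dic and char.dic in the same directory; adjust model_dir to your setup), to see which side is off:

```python
# Diagnostic sketch (not part of the repository): compare the embedding
# shapes stored in the checkpoint against the dictionary files on disk.
import os
import torch

model_dir = "zhs.model"  # assumed path to the pre-trained Chinese ELMo model

# The "from checkpoint" side of the error: shapes saved in token_embedder.pkl.
state = torch.load(os.path.join(model_dir, "token_embedder.pkl"),
                   map_location="cpu")
print("checkpoint word emb:", tuple(state["word_emb_layer.embedding.weight"].shape))
print("checkpoint char emb:", tuple(state["char_emb_layer.embedding.weight"].shape))

# The "current model" side: entry counts of the dictionaries being loaded
# (elmoformanylangs adds a handful of special tokens on top of these).
for dic in ("word.dic", "char.dic"):
    with open(os.path.join(model_dir, dic), encoding="utf-8") as f:
        print(dic, "entries:", sum(1 for _ in f))
```

If the .dic counts match the smaller shapes, the dictionaries belong to a different (smaller) release than the .pkl weights, and re-downloading the full model archive so that all files come from the same version should resolve the mismatch.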

sunyilgdx commented 2 years ago

Sorry, I'm not really sure what the problem is either. Could the document be too long? As far as I remember, the code doesn't handle overly long documents.

yxzai commented 2 years ago

Hello, sorry to bother you after all this time. How did you end up solving this problem?

Ai-Sherry commented 2 years ago

When running test.py, does the script need to be moved up one directory level before executing? Otherwise the embeddings package can't be found, right?

Ai-Sherry commented 2 years ago

> Hello, I get the following error when running the test case test.py that you provided: […]

Hi, did you manage to get this model running?

sunyilgdx commented 2 years ago

> When running test.py, does the script need to be moved up one directory level before executing? Otherwise the embeddings package can't be found, right?

I ran it directly in PyCharm, with no path changes at all. You do need to modify the code in elmo.py as described in the README; did you make that change?
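For context, PyCharm puts the project root on sys.path automatically (content roots are added by default), which is why no path changes are needed there. When launching test/test.py from a plain terminal instead, the top-level embeddings package will not resolve. A minimal sketch, assuming the standard repository layout with test.py inside SIFRank_zh/test/, of what to prepend before the project imports:

```python
# Hypothetical workaround for running test/test.py outside PyCharm:
# put the repository root (one level above test/) on sys.path first.
import os
import sys

sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from embeddings import word_emb_elmo  # resolvable once the root is on sys.path
```

Equivalently, run the script from the repository root with PYTHONPATH set, e.g. PYTHONPATH=. python test/test.py.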

Ai-Sherry commented 2 years ago

Hi, I'm running it directly in PyCharm and still hitting a lot of problems. If it's convenient, my WeChat is 13390015172; I'd like to discuss further.
