sunyilgdx / SIFRank_zh

Keyphrase or Keyword Extraction: a Chinese keyword extraction method based on pre-trained models (Chinese-language code for the paper "SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model")
417 stars 80 forks

Error when running test/test.py #1

Closed Vincent131499 closed 4 years ago

Vincent131499 commented 4 years ago

The error message is as follows:

Model loaded succeed
2020-03-03 14:04:31,192 INFO: 1 batches, avg len: 153.0
Traceback (most recent call last):
  File "D:/my_code/github项目/MeteorMan's nlp_lib/关键词抽取/SIFRank_zh-master/test/test.py", line 22, in <module>
    keyphrases = SIFRank(text, SIF, zh_model, N=15,elmo_layers_weight=elmo_layers_weight)
  File "D:\my_code\github项目\MeteorMan's nlp_lib\关键词抽取\SIFRank_zh-master\model\method.py", line 179, in SIFRank
    sent_embeddings, candidate_embeddings_list = SIF.get_tokenized_sent_embeddings(text_obj,if_DS=if_DS,if_EA=if_EA)
  File "D:\my_code\github项目\MeteorMan's nlp_lib\关键词抽取\SIFRank_zh-master\embeddings\sent_emb_sif_backup.py", line 49, in get_tokenized_sent_embeddings
    elmo_embeddings = context_embeddings_alignment(elmo_embeddings, tokens_segmented)
  File "D:\my_code\github项目\MeteorMan's nlp_lib\关键词抽取\SIFRank_zh-master\embeddings\sent_emb_sif_backup.py", line 90, in context_embeddings_alignment
    emb = elmo_embeddings[i, 1, j, :]
IndexError: too many indices for tensor of dimension 3

Is there a problem with how elmo_embeddings is handled here?
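The last line of the traceback is the error raised when a 3-dimensional array is indexed with four indices. The following is a minimal sketch of that situation using NumPy rather than the project's own tensors; all sizes (1 sentence, 3 layers, 5 tokens, 8 dimensions) and the helper `index_fails` are hypothetical, and the idea that the layer axis was averaged away is an assumption for illustration, not something the traceback itself proves.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_sent, n_layers, n_tokens, dim = 1, 3, 5, 8

# Layout that a four-index access expects: (sentence, layer, token, dim).
per_layer = np.zeros((n_sent, n_layers, n_tokens, dim))
emb = per_layer[0, 1, 0, :]  # works: layer 1, token 0 of sentence 0

# Assumed failure case: the layer axis is gone, leaving (sentence, token, dim).
averaged = per_layer.mean(axis=1)

def index_fails(arr):
    """Return True if the four-index access raises IndexError."""
    try:
        arr[0, 1, 0, :]
        return False
    except IndexError:
        return True

print(per_layer.ndim, index_fails(per_layer))
print(averaged.ndim, index_fails(averaged))
```

If the embeddings really arrive with three dimensions, the access `elmo_embeddings[i, 1, j, :]` at line 90 would fail in the same way.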

sunyilgdx commented 4 years ago

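I'm not sure whether this is the problem (#31): the HIT (Harbin Institute of Technology) ELMo code has an obvious bug, in that the code for returning the embeddings of all layers is wrong. I suggest changing it as follows. The original code in class Embedder(object) in elmo.py is: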

if output_layer == -1:
    payload = np.average(data, axis=0)
else:
    payload = data[output_layer]

I suggest changing it to:

if output_layer == -1:
    payload = np.average(data, axis=0)
# code changed here
elif output_layer == -2:
    payload = data
else:
    payload = data[output_layer]

Try it and see whether this is the problem. I forgot to document it earlier.
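The effect of the extra branch can be sketched in isolation. The function `select_payload` and the array sizes below are hypothetical and are not part of elmo.py; the sketch only assumes that `data` holds one array per layer, with shape (layers, tokens, dim).

```python
import numpy as np

def select_payload(data, output_layer):
    """Hypothetical stand-alone version of the patched branch."""
    if output_layer == -1:
        return np.average(data, axis=0)  # average over the layer axis
    elif output_layer == -2:
        return data                      # keep every layer
    else:
        return data[output_layer]        # pick a single layer

# Hypothetical sizes: 3 layers, 5 tokens, 8 dimensions.
data = np.arange(3 * 5 * 8, dtype=float).reshape(3, 5, 8)

averaged = select_payload(data, -1)    # layer axis removed
all_layers = select_payload(data, -2)  # layer axis kept
single = select_payload(data, 1)       # one layer only

# Stacking per-sentence payloads that keep the layer axis gives a
# 4-dimensional batch, so a four-index access has an axis per index.
batch = np.stack([all_layers])
emb = batch[0, 1, 0, :]

print(averaged.shape, all_layers.shape, single.shape, batch.shape)
```

Under these assumptions, only `output_layer == -2` preserves the layer axis that an access such as `elmo_embeddings[i, 1, j, :]` relies on.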

Vincent131499 commented 4 years ago

OK, that solved it. Thanks.