sunyilgdx / SIFRank_zh

Keyphrase or Keyword Extraction: a Chinese keyword extraction method based on pre-trained models (Chinese-language code for the paper SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model)
420 stars, 78 forks

Running test.py produces the following error message #2

Closed changquanyou closed 4 years ago

changquanyou commented 4 years ago

Traceback (most recent call last):
  File "test.py", line 22, in <module>
    keyphrases = SIFRank(text, SIF, zh_model, N=15,elmo_layers_weight=elmo_layers_weight)
  File "../model/method.py", line 179, in SIFRank
    sent_embeddings, candidate_embeddings_list = SIF.get_tokenized_sent_embeddings(text_obj,if_DS=if_DS,if_EA=if_EA)
  File "../embeddings/sent_emb_sif.py", line 49, in get_tokenized_sent_embeddings
    elmo_embeddings = context_embeddings_alignment(elmo_embeddings, tokens_segmented)
  File "../embeddings/sent_emb_sif.py", line 90, in context_embeddings_alignment
    emb = elmo_embeddings[i, 1, j, :]
IndexError: too many indices for tensor of dimension 3

changquanyou commented 4 years ago

I see. The ELMoForManyLangs git repo has already fixed this issue (output_layer == -2 means all layers), but the version installed with pip install does not include that change.

sunyilgdx commented 4 years ago

This issue has not been resolved in the released package, so you will probably have to change the code yourself in your local installation.

vigosser commented 4 years ago

Same issue here.

sunyilgdx commented 4 years ago

The code of elmoformanylangs 0.0.3 has not been fixed. Please change the code of sents2elmo in elmo.py yourself, from

if output_layer == -1:
    payload = np.average(data, axis=0)
else:
    payload = data[output_layer]

to

if output_layer == -1:
    payload = np.average(data, axis=0)
elif output_layer == -2:  # code changed here: -2 returns all layers
    payload = data
else:
    payload = data[output_layer]
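
If you want to check the effect of the patch without loading a model, here is a minimal sketch of the same branch logic on dummy NumPy data. The helper name select_layers and the sizes (3 layers, 5 tokens, 1024 dimensions) are assumptions for illustration, not part of elmoformanylangs, and np.stack is only a rough stand-in for however SIFRank batches the per-sentence output. It suggests why the traceback reports a 3-dimensional tensor: without the -2 branch the layer axis is averaged away, so indexing with four indices fails.

```python
import numpy as np


def select_layers(data, output_layer=-1):
    """Hypothetical stand-alone copy of the patched branch.

    data is assumed to have shape (num_layers, num_tokens, dim)
    for a single sentence.
    """
    if output_layer == -1:
        return np.average(data, axis=0)  # layer axis averaged away
    elif output_layer == -2:
        return data                      # all layers kept
    else:
        return data[output_layer]        # one layer selected


# Dummy data with assumed sizes: 3 layers, 5 tokens, 1024 dimensions.
data = np.zeros((3, 5, 1024), dtype=np.float32)

averaged = select_layers(data, -1)    # shape (5, 1024)
all_layers = select_layers(data, -2)  # shape (3, 5, 1024)

# Stack one sentence into a batch, as a stand-in for the real pipeline.
batch_avg = np.stack([averaged])      # 3 dimensions
batch_all = np.stack([all_layers])    # 4 dimensions

# Four indices work only when the layer axis is still present.
emb = batch_all[0, 1, 0, :]

raised = False
try:
    batch_avg[0, 1, 0, :]
except IndexError:
    # Same kind of failure as in the traceback: one index too many.
    raised = True

print(averaged.shape, all_layers.shape, emb.shape, raised)
```

With the -2 branch in place, the layer axis survives and an index like [i, 1, j, :] has something to select from.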