sunyilgdx / SIFRank_zh

Keyphrase or Keyword Extraction: a Chinese keyword extraction method based on pre-trained models (the Chinese-language code for the paper SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model)
417 stars 80 forks

index can't contain negative values #19

Closed xesgue closed 2 years ago

xesgue commented 2 years ago

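I read through the earlier issues on this error, but none of them gives a concrete solution. I am attaching my elmo.py here. Thank you very much! elmo.txt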

sunyilgdx commented 2 years ago

Could you post the error you are getting? @xesgue

xesgue commented 2 years ago

Traceback (most recent call last):
  File "D:/algo/SIFRank_zh-master/embeddings/word_emb_elmo.py", line 38, in <module>
    embs = elmo.get_tokenized_words_embeddings(sents)
  File "D:/algo/SIFRank_zh-master/embeddings/word_emb_elmo.py", line 29, in get_tokenized_words_embeddings
    elmo_embedding = [np.pad(emb, pad_width=((0,0),(0,max_len-emb.shape[1]),(0,0)) , mode='constant') for emb in elmo_embedding]
  File "D:/algo/SIFRank_zh-master/embeddings/word_emb_elmo.py", line 29, in <listcomp>
    elmo_embedding = [np.pad(emb, pad_width=((0,0),(0,max_len-emb.shape[1]),(0,0)) , mode='constant') for emb in elmo_embedding]
  File "", line 6, in pad
  File "D:\Anaconda\envs\baidu\lib\site-packages\numpy\lib\arraypad.py", line 748, in pad
    pad_width = _as_pairs(pad_width, array.ndim, as_index=True)
  File "D:\Anaconda\envs\baidu\lib\site-packages\numpy\lib\arraypad.py", line 519, in _as_pairs
    raise ValueError("index can't contain negative values")
ValueError: index can't contain negative values

This is the same error message that others reported earlier. @sunyilgdx
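For readers who want to see the failure in isolation, the following is a minimal sketch that triggers the same ValueError without ELMo. The array shape and the value of max_len are made up for illustration; only the np.pad call mirrors line 29 of word_emb_elmo.py as shown in the traceback.

```python
import numpy as np

# Hypothetical stand-in for one ELMo output: (layers, tokens, dim).
emb = np.zeros((3, 6, 8))
# Hypothetical value that is one less than the token axis of emb.
max_len = 5

try:
    # max_len - emb.shape[1] is -1 here, and np.pad rejects negative widths.
    np.pad(emb, pad_width=((0, 0), (0, max_len - emb.shape[1]), (0, 0)), mode='constant')
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

So the error appears whenever any array in elmo_embedding has more entries along axis 1 than max_len.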

xesgue commented 2 years ago

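I think I have found the problem: it comes from the values of max_len and emb.shape[1] in word_emb_elmo.py. Whether I use your test document or any other document, emb.shape[1] is always 1 greater than max_len, so the subtraction in the code produces a negative number. I wrapped the difference in abs, changing the line to [np.pad(emb, pad_width=((0,0),(0,abs(max_len-emb.shape[1])),(0,0)) , mode='constant') for emb in elmo_embedding], and it ran successfully. Do you know what causes this mismatch?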
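One caveat about the abs workaround: it makes np.pad accept the call, but an array that is already longer than max_len gets padded further instead of being left alone, so the padded arrays need not share a common length. A possible alternative, sketched below, is to take the target length from the embeddings themselves rather than from a separately computed max_len. The helper name pad_to_common_length and the shapes are assumptions for illustration, not code from the repository, and the sketch does not explain why emb.shape[1] exceeds max_len in the first place.

```python
import numpy as np

def pad_to_common_length(embeddings):
    """Zero-pad each (layers, tokens, dim) array along axis 1 to the longest token count."""
    # Deriving the target from the arrays guarantees a non-negative pad width.
    target_len = max(emb.shape[1] for emb in embeddings)
    return [
        np.pad(emb,
               pad_width=((0, 0), (0, target_len - emb.shape[1]), (0, 0)),
               mode='constant')
        for emb in embeddings
    ]

# Hypothetical batch: two sentences with 4 and 6 tokens, 3 layers, 8 dimensions.
batch = [np.ones((3, 4, 8)), np.ones((3, 6, 8))]
padded = pad_to_common_length(batch)
stacked = np.stack(padded)
print(stacked.shape)
```

Whether downstream code in SIFRank_zh expects exactly max_len entries per sentence is not settled in this thread, so the result should be checked against the rest of the pipeline.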

sunyilgdx commented 2 years ago

I am not sure either. It runs normally on our side, and many other users have also been able to run it after modifying the elmo.py file.

lmx666-gif commented 2 years ago

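I think I have found the problem: it comes from the values of max_len and emb.shape[1] in word_emb_elmo.py. Whether I use your test document or any other document, emb.shape[1] is always 1 greater than max_len, so the subtraction in the code produces a negative number. I wrapped the difference in abs, changing the line to [np.pad(emb, pad_width=((0,0),(0,abs(max_len-emb.shape[1])),(0,0)) , mode='constant') for emb in elmo_embedding], and it ran successfully. Do you know what causes this mismatch?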

Hello, after applying your method I ran into a new problem: ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (3,2) and requested shape (2,2). Do you know why?
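A note on this second error: the shapes in the message are consistent with np.pad receiving three (before, after) pairs, shape (3,2), for an array that has only two dimensions, which needs shape (2,2). That would suggest the embeddings in this setup are 2-D rather than 3-D, although the thread does not confirm it. The sketch below reproduces the mismatch with made-up shapes and shows one way to build pad_width from emb.ndim; pad_token_axis is a hypothetical helper, not part of the repository.

```python
import numpy as np

# Hypothetical 2-D embedding: (tokens, dim), with no leading layer axis.
emb_2d = np.zeros((5, 4))

try:
    # Three pairs for a 2-D array: np.pad cannot fit them to (ndim, 2).
    np.pad(emb_2d, pad_width=((0, 0), (0, 1), (0, 0)), mode='constant')
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

def pad_token_axis(emb, target_len, token_axis):
    """Zero-pad one axis up to target_len, whatever the number of dimensions."""
    pad_width = [(0, 0)] * emb.ndim
    # Arrays already at or beyond target_len are left unchanged on that axis.
    pad_width[token_axis] = (0, max(target_len - emb.shape[token_axis], 0))
    return np.pad(emb, pad_width=pad_width, mode='constant')

padded_2d = pad_token_axis(emb_2d, target_len=7, token_axis=0)
padded_3d = pad_token_axis(np.zeros((3, 5, 4)), target_len=7, token_axis=1)
print(padded_2d.shape, padded_3d.shape)
```

Printing emb.shape just before the np.pad call in word_emb_elmo.py would show which case applies.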