letiantian / TextRank4ZH

:deciduous_tree: Automatically extract keywords and summaries from Chinese text
MIT License
3.27k stars 847 forks

Bug in the code #5

Closed qiqipipioioi closed 8 years ago

qiqipipioioi commented 8 years ago
    def get_keyphrases(self, keywords_num=12, min_occur_num=2):
        # Candidate words: the top-ranked keywords
        keywords_set = set(self.get_keywords(num=keywords_num, word_min_len=1))

        keyphrases = set()
        for sentence_list in self.words_no_filter:
            one = []  # !!!!!! start every sentence with an empty buffer
            for word in sentence_list:
                if word in keywords_set:
                    # Extend the current run of adjacent keywords
                    one.append(word)
                else:
                    # A non-keyword ends the run; keep it only if it spans 2+ words
                    if len(one) > 1:
                        keyphrases.add(''.join(one))
                    one = []
        # Keep only phrases that occur at least min_occur_num times in the text
        return [phrase for phrase in keyphrases
                if self.text.count(phrase) >= min_occur_num]

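Hi, `get_keyphrases` in your TextRank4Keyword.py has a bug. Note the line marked "# !!!!!!" in the snippet: I moved the initialization of `one` there for you, so the buffer is reset at the start of each sentence.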
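To make the effect of the reset easier to check outside the class, here is a minimal standalone sketch. It is not the library's code: the function name `extract_keyphrases`, its parameters, and the sample sentences are hypothetical, and the flush after the inner loop is an extra step that the snippet above does not have.

```python
def extract_keyphrases(sentences, keywords, text, min_occur_num=2):
    """Join runs of adjacent keywords inside each sentence into phrases.

    Hypothetical helper for illustration; not part of TextRank4ZH.
    """
    keywords_set = set(keywords)
    keyphrases = set()
    for sentence in sentences:
        one = []  # fresh buffer per sentence, so a run never spans two sentences
        for word in sentence:
            if word in keywords_set:
                one.append(word)
            else:
                # A non-keyword ends the run; keep runs of two or more words
                if len(one) > 1:
                    keyphrases.add(''.join(one))
                one = []
        # Assumption beyond the posted snippet: also keep a run that
        # reaches the end of the sentence
        if len(one) > 1:
            keyphrases.add(''.join(one))
    # Keep phrases that occur often enough in the raw text
    return sorted(p for p in keyphrases if text.count(p) >= min_occur_num)


if __name__ == "__main__":
    sentences = [["机器", "学习", "很", "有趣"], ["我", "喜欢", "机器", "学习"]]
    text = "机器学习很有趣。我喜欢机器学习"
    print(extract_keyphrases(sentences, ["机器", "学习"], text))
```

With the buffer created inside the outer loop, a keyword at the end of one sentence can no longer be glued to a keyword at the start of the next one.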

letiantian commented 8 years ago

I'll take a look at this issue soon.

letiantian commented 8 years ago

Fixed. I've released a new version, and the API has changed. You're welcome to try it.