hjing100 closed 3 years ago
Hi, thanks for using pke.
Chinese is not supported in nltk, that's why you get the errors.
You can pass your custom stoplist in candidate_selection, candidate_filtering (if you use it) and candidate_weighting (if the method needs stopwords).
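A minimal sketch of what such a custom stoplist could look like, in case it helps. The helper names `build_stoplist` and `filter_candidates` are made up for this illustration and are not part of pke; the filter only mimics the idea of dropping candidates that contain a stopword, it is not pke's implementation.

```python
import string


def build_stoplist(words, extra_symbols="，。、；：？！“”‘’（）《》"):
    """Return a sorted, de-duplicated stoplist.

    `words` is any iterable of stopwords (for example the lines of your own
    Chinese stopword file). ASCII punctuation and a few full-width CJK
    punctuation marks are added as well.
    """
    stoplist = {w.strip() for w in words if w.strip()}
    stoplist.update(string.punctuation)  # adds each ASCII punctuation character
    stoplist.update(extra_symbols)       # adds each full-width symbol
    return sorted(stoplist)


def filter_candidates(candidates, stoplist):
    """Keep only the candidates (tuples of tokens) that contain no stopword."""
    blocked = set(stoplist)
    return [c for c in candidates if not blocked.intersection(c)]


stoplist = build_stoplist(["的", "和", " 了 ", ""])
candidates = [("关键词", "提取"), ("的", "方法"), ("评估",)]
kept = filter_candidates(candidates, stoplist)

# With pke, the same list would then be handed to the calls named above,
# e.g. extractor.candidate_selection(stoplist=stoplist). The exact keyword
# arguments depend on the pke version and on the model, so check the
# signature of the method you are calling.
```

Building the list once and reusing it for selection, filtering and weighting keeps the three steps consistent with each other.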
Not filtering candidates based on stopwords can alter the keyphrase scores for methods that compute scores globally (graph-based methods, or TF-IDF if the TF-IDF matrix is computed with stopwords), but it won't change anything for EmbedRank, for example, because each candidate's score does not depend on the other candidates.
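To make the difference concrete, here is a toy sketch. It uses none of pke's actual scoring: relative frequency stands in for a "global" score, and character overlap with a fixed reference string stands in for a per-candidate score such as an embedding similarity. Both functions and the sample tokens are assumptions made up for this illustration.

```python
from collections import Counter


def global_scores(tokens, stoplist=frozenset()):
    """Relative frequency among the kept tokens.

    Each score is normalised by the total count, so it depends on every
    other token that was kept.
    """
    kept = [t for t in tokens if t not in stoplist]
    counts = Counter(kept)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def independent_scores(tokens, reference, stoplist=frozenset()):
    """Character-set Jaccard overlap between each kept token and `reference`.

    Each score is computed from the token and the reference only, so the
    presence or absence of other tokens cannot change it.
    """
    ref_chars = set(reference)
    scores = {}
    for t in tokens:
        if t in stoplist:
            continue
        chars = set(t)
        scores[t] = len(chars & ref_chars) / len(chars | ref_chars)
    return scores


tokens = ["关键词", "的", "提取", "的", "方法", "和", "关键词", "的", "评估"]
stoplist = {"的", "和"}

# Global score of "关键词" moves when stopwords are removed ...
g_with_stopwords = global_scores(tokens)["关键词"]
g_without_stopwords = global_scores(tokens, stoplist)["关键词"]

# ... while its independent score stays the same.
i_with_stopwords = independent_scores(tokens, "关键词提取")["关键词"]
i_without_stopwords = independent_scores(tokens, "关键词提取", stoplist)["关键词"]
```

Without a stoplist the stopwords still show up as candidates themselves in both cases; the point is only that the scores of the remaining candidates shift in the global case and not in the independent one.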
WARNING:root:No stopwords for 'zh' language.
WARNING:root:Please provide custom stoplist if willing to use stopwords. Or update nltk's stopwords corpora using nltk.download('stopwords')
WARNING:root:No stemmer for 'zh' language.
WARNING:root:Stemming will not be applied.