How do I use my own stopwords?

import pke
import spacy

nlp = spacy.load('zh_core_web_sm')

# initialize keyphrase extraction model, here YAKE
extractor = pke.unsupervised.YAKE()

# load the content of the document, here document is expected to be in raw
# format (i.e. a simple text file) and preprocessing is carried out using spacy
extractor.load_document(input=inputfile, language='zh', spacy_model=nlp)

if my_stoplist is not None:
    # use the custom stopword list
    extractor.candidate_filtering(stoplist=my_stoplist)

# keyphrase candidate selection, in the case of YAKE: n-grams of up to
# 3 words by default
extractor.candidate_selection()

# candidate weighting, in the case of YAKE: using statistical features of the words
extractor.candidate_weighting()

# N-best selection, keyphrases contains the 3 best candidates as
# (keyphrase, score) tuples
keyphrases = extractor.get_n_best(n=3)

WARNING:root:No stopwords for 'zh' language.
WARNING:root:Please provide custom stoplist if willing to use stopwords. Or update nltk's stopwords with nltk.download('stopwords')
WARNING:root:No stemmer for 'zh' language.
WARNING:root:Stemming will not be applied.
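
The first two warnings say that no stoplist was known when the document was loaded. Below is a minimal sketch of one possible fix, not a confirmed answer from the maintainers. It assumes a pke version whose load_document accepts a stoplist keyword (the same releases that accept spacy_model appear to), so the custom list is passed at load time rather than to candidate_filtering. The helper names load_stoplist and extract_keyphrases, and the one-word-per-line file format, are hypothetical and introduced only for illustration.

```python
from pathlib import Path


def load_stoplist(path, encoding="utf-8"):
    """Read a stopword file with one word per line (hypothetical format).

    Blank lines and repeated words are skipped; order is preserved.
    """
    words = []
    seen = set()
    for line in Path(path).read_text(encoding=encoding).splitlines():
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


def extract_keyphrases(text, stoplist, spacy_model=None, n=3):
    """Run YAKE with a custom stoplist (hypothetical helper)."""
    # Imported here so load_stoplist can be used without pke installed.
    import pke

    extractor = pke.unsupervised.YAKE()
    # Assumption: this pke version takes the stoplist in load_document.
    extractor.load_document(
        input=text,
        language="zh",
        stoplist=stoplist,
        spacy_model=spacy_model,
    )
    extractor.candidate_selection()
    extractor.candidate_weighting()
    return extractor.get_n_best(n=n)
```

With this in place, something like extract_keyphrases(open(inputfile, encoding='utf-8').read(), load_stoplist('my_stopwords.txt'), spacy_model=nlp) would replace the steps above. The stemmer warnings are unrelated to stopwords and would still be printed for 'zh'.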

boudinfl / pke

How do I use my own stopwords? #161