Open bsariturk opened 3 months ago
When you pass candidates to KeyBERT, the only thing that you are doing is adding them as part of the `CountVectorizer` vocabulary. So if you have a custom `CountVectorizer`, simply add the list of candidate words to the `vocabulary` parameter.
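A minimal sketch of that suggestion, assuming scikit-learn's `CountVectorizer` and KeyBERT's `extract_keywords(..., vectorizer=...)` argument. The helper names `build_candidate_vectorizer` and `extract_with_candidates` are made up for illustration and are not part of either library:

```python
from sklearn.feature_extraction.text import CountVectorizer


def build_candidate_vectorizer(candidates, tokenizer):
    """Return a CountVectorizer whose vocabulary is limited to the candidates.

    Hypothetical helper, not part of KeyBERT or scikit-learn.
    """
    # CountVectorizer rejects duplicate vocabulary terms, so drop repeats
    # while keeping the original order of the candidates.
    vocabulary = list(dict.fromkeys(candidates))
    return CountVectorizer(
        tokenizer=tokenizer,    # e.g. jieba.lcut for Chinese documents
        vocabulary=vocabulary,  # only these terms are counted
        token_pattern=None,     # unused once a custom tokenizer is given
    )


def extract_with_candidates(doc, candidates, tokenizer):
    """Run KeyBERT with the candidates supplied through the vectorizer."""
    # Imported lazily so the vectorizer helper works without KeyBERT installed.
    from keybert import KeyBERT

    kw_model = KeyBERT()
    vectorizer = build_candidate_vectorizer(candidates, tokenizer)
    return kw_model.extract_keywords(doc, vectorizer=vectorizer)
```

For Chinese documents, something like `extract_with_candidates(doc, candidates, jieba.lcut)` should work, provided each candidate is a token that the tokenizer actually produces; candidates that never appear as tokens simply get a zero count.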
Thank you so much Maarten. I managed to use my candidates list by providing it as vocabulary to a custom vectorizer.
I'm using jieba to tokenize my Chinese documents, as suggested here in the issues and in the documentation. The documentation also says that if I use a vectorizer, I cannot use a candidates list. In that case, is there a way to use a candidates list with Chinese documents?