MaartenGr / KeyBERT

Minimal keyword extraction with BERT
https://MaartenGr.github.io/KeyBERT/
MIT License
3.47k stars, 344 forks

Suggesting new terms for my vocabulary #177

Open hekl opened 1 year ago

hekl commented 1 year ago

Hello Maarten, I see that KeyBERT can also work with a vocabulary. I am interested in the difference between my own vocabulary and the terms that KeyBERT can suggest. Basically, I want to trace which of my vocabulary terms fit the documents and which new terms I might have to add. Is that possible? Henk

MaartenGr commented 1 year ago

The words that KeyBERT can give back are the ones produced by the underlying tokenizer for each document. In other words, you could apply the tokenizer outside of KeyBERT to see which potential words it can return, and then compare those words with your own vocabulary.
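
A minimal sketch of that comparison, under a few assumptions: the regex below is only a stand-in for whichever tokenizer your KeyBERT setup actually uses (it imitates the default word pattern of scikit-learn's `CountVectorizer`, lowercased, words of two or more characters), the helper names `candidate_words` and `compare_vocabulary` are hypothetical, and the sample documents and vocabulary are made up for illustration. It only handles single-word terms.

```python
import re

# Assumption: a stand-in for the real tokenizer. It keeps words of two or
# more word characters, similar to CountVectorizer's default token pattern.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def candidate_words(docs):
    """Collect the lowercase single-word candidates found in the documents."""
    words = set()
    for doc in docs:
        words.update(TOKEN_PATTERN.findall(doc.lower()))
    return words


def compare_vocabulary(vocabulary, docs):
    """Split terms into vocabulary hits, vocabulary misses, and new candidates."""
    candidates = candidate_words(docs)
    vocab = {term.lower() for term in vocabulary}
    return {
        "matched": sorted(vocab & candidates),  # vocabulary terms present in the docs
        "unused": sorted(vocab - candidates),   # vocabulary terms absent from the docs
        "new": sorted(candidates - vocab),      # document words missing from the vocabulary
    }


# Hypothetical sample data.
docs = [
    "Supervised learning maps inputs to outputs.",
    "Learning from labeled examples.",
]
my_vocabulary = ["learning", "supervised", "clustering"]

report = compare_vocabulary(my_vocabulary, docs)
for group, terms in report.items():
    print(group, terms)
```

The "new" group here is every tokenized word outside the vocabulary, so it still includes stop words and other noise. Running KeyBERT on the documents and filtering its keywords against the same vocabulary set would narrow this down to the terms it actually ranks highly.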