MaartenGr / KeyBERT

Minimal keyword extraction with BERT
https://MaartenGr.github.io/KeyBERT/
MIT License

change of Language and bulk data #150

Open Adafi123 opened 1 year ago

Adafi123 commented 1 year ago

Dear all, how can I get KeyBERT to recognise another language? My document is in German, and this is affecting my output. Also, how can I get KeyBERT to analyse over 2M words at once?

MaartenGr commented 1 year ago

It might be worthwhile to use an embedding model that supports multiple languages or German, such as those found here: https://www.sbert.net/docs/pretrained_models.html

Personally, I would advise using "paraphrase-multilingual-mpnet-base-v2".

Adafi123 commented 1 year ago

Thanks, MaartenGr. Could you give me a sample of your command? I am getting an error when I try to use the multilingual model.

MaartenGr commented 1 year ago

@Adafi123 Sure, just run the following:

from keybert import KeyBERT
kw_model = KeyBERT(model="paraphrase-multilingual-mpnet-base-v2")

You can find more about the embedding models here.
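The thread does not cover the second half of the question (analysing around 2M words), so here is a rough sketch of one possible approach, not something confirmed by the maintainer. It assumes that the embedding model truncates long inputs, as sentence-transformers models generally do, so a very long text is split into smaller pieces first. The helper names `chunk_words` and `extract_keywords_in_chunks` and the chunk size of 500 words are made up for illustration. `stop_words=None` is passed because `extract_keywords` defaults to English stop words, which is probably not what you want for German.

```python
from typing import Iterator, List, Tuple


def chunk_words(text: str, words_per_chunk: int = 500) -> Iterator[str]:
    """Yield consecutive pieces of `text` with at most `words_per_chunk` words each.

    Hypothetical helper, not part of KeyBERT. The default of 500 is an
    arbitrary assumption; tune it for your model and data.
    """
    if words_per_chunk < 1:
        raise ValueError("words_per_chunk must be at least 1")
    words = text.split()
    for start in range(0, len(words), words_per_chunk):
        yield " ".join(words[start:start + words_per_chunk])


def extract_keywords_in_chunks(
    text: str, words_per_chunk: int = 500, top_n: int = 5
) -> List[List[Tuple[str, float]]]:
    """Return one list of (keyword, score) pairs per chunk of `text`."""
    # Imported here so that chunk_words can be used without keybert installed.
    from keybert import KeyBERT

    kw_model = KeyBERT(model="paraphrase-multilingual-mpnet-base-v2")
    chunks = list(chunk_words(text, words_per_chunk))
    if not chunks:
        return []

    # extract_keywords accepts a list of documents.
    # stop_words=None turns off the default English stop-word list.
    results = kw_model.extract_keywords(chunks, top_n=top_n, stop_words=None)

    # Depending on the KeyBERT version, a single document may come back as a
    # flat list of tuples; wrap it so the caller always gets a list per chunk.
    if results and isinstance(results[0], tuple):
        results = [results]
    return results


if __name__ == "__main__":
    german_text = "Maschinelles Lernen ist ein Teilgebiet der künstlichen Intelligenz. " * 50
    for chunk_keywords in extract_keywords_in_chunks(german_text, words_per_chunk=100):
        print(chunk_keywords)
```

Splitting on whitespace is the simplest option; splitting on sentences or paragraphs would likely give more coherent chunks, at the cost of a little more code.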