Open orenpapers opened 4 years ago
You just need to load one of the multi-lingual models: https://github.com/UKPLab/sentence-transformers#multilingual-models
Then you can use it as shown in the examples. Just input Hebrew instead of English.
@nreimers I understand, but can you please give an example of the proper syntax? I tried several variations, but none of them worked. For example:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('xlm-r-he-bert-base-nli-stsb-mean-tokens')
Got:
HTTPError: 404 Client Error: Not Found for url: https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/xlm-r-he-bert-base-nli-stsb-mean-tokens.zip
There is only one model, called xlm-r-100langs-bert-base-nli-mean-tokens, that is able to process text from 100 languages. You don't need to specify your input language. Just input your text and you get an embedding, independent of your language. You can use the same model for all the listed languages, and sentences with similar meaning will be close to each other in the embedding space.
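For anyone looking for the syntax, here is a minimal sketch of the above, assuming the model name from this thread can still be downloaded by your installed version of sentence-transformers. The helper names `embed`, `cosine_similarity`, and `demo` are made up for illustration and are not part of the package; only `SentenceTransformer(...)` and `model.encode(...)` are library calls.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    return dot / (norm_a * norm_b)


def embed(sentences, model_name="xlm-r-100langs-bert-base-nli-mean-tokens"):
    """Encode sentences with the multilingual model (hypothetical helper).

    The import is done here so the pure-Python helper above can be used
    without the package installed. The model is downloaded on first use.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # No language argument: the same call is used for Hebrew and English.
    return model.encode(sentences)


def demo():
    """Compare a Hebrew sentence with an English one (requires the download)."""
    sentences = ["שלום עולם", "Hello world"]
    embeddings = embed(sentences)
    print(cosine_similarity(embeddings[0], embeddings[1]))
```

Calling `demo()` downloads the model and prints one similarity score. The Hebrew string is passed as an ordinary Python string, with no language tag and no special handling.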
@nreimers How does it identify the language? Please note that Hebrew is RTL, so should I insert the input differently?
It does not need to know the language. No changes are needed for RTL languages.
Hello, how can I use the package to get embeddings for a language other than English (e.g. Hebrew)? I couldn't find any code example showing how to load the model. Thanks