UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

read xlm-r-100langs-bert-base-nli-stsb-mean-tokens error #371

Open hertz-pj opened 4 years ago

hertz-pj commented 4 years ago

Hello. I downloaded xlm-r-100langs-bert-base-nli-stsb-mean-tokens and tried to load it with SentenceTransformer('pretrained-model/xlm-r-100langs-bert-base-nli-stsb-mean-tokens/0_Transformer'), which fails with this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/dockerdata/brettcheng/anaconda2/envs/mqa/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 74, in __init__
    if config['__version__'] > __version__:
KeyError: '__version__'

nreimers commented 4 years ago

Hi, you must pass the top-level model directory, not its 0_Transformer subfolder:

SentenceTransformer('pretrained-model/xlm-r-100langs-bert-base-nli-stsb-mean-tokens')
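
A minimal sketch for checking which kind of directory a path points to before loading it. It assumes that a directory saved by sentence-transformers has a modules.json at its top level, while a plain transformer directory (such as the 0_Transformer subfolder) has only a config.json describing the transformer. The helper name describe_model_dir is hypothetical and not part of the library.

```python
import os


def describe_model_dir(path):
    """Guess what kind of model directory `path` is (hypothetical helper).

    Assumption: sentence-transformers writes a modules.json at the top
    level of a saved model, and a plain transformer directory does not
    have one.
    """
    if os.path.isfile(os.path.join(path, "modules.json")):
        # Safe to pass to SentenceTransformer(path).
        return "sentence-transformers"
    if os.path.isfile(os.path.join(path, "config.json")):
        # Only a transformer config: this is the case that raises
        # KeyError: '__version__' in the traceback above.
        return "plain-transformer"
    return "unknown"
```

If the result is "plain-transformer" for a path ending in 0_Transformer, passing the parent directory instead is the fix described above.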

thiziri commented 3 years ago

I have a similar problem:

1. I used the command git submodule add https://huggingface.co/bert-base-multilingual-uncased to add the model as a submodule of my repository.
2. I put it in a directory named pretrained/mbert/.
3. I used the following code to load it:

from sentence_transformers import SentenceTransformer

def embed_text(sentences, pretrained="../pretrained/mbert/bert-base-multilingual-cased"): 
    """
    Computes the embeddings of the different sentences in input.
    :param sentences: list, of sentences
    :param pretrained: str, the pretrained bert model
    :return: list, of list
    """

    model = SentenceTransformer(pretrained) 
    sentence_embeddings = model.encode(sentences)

    return [arr.tolist() for arr in sentence_embeddings]

I've got the following error:

model = SentenceTransformer(pretrained)  
  File "C:\ProgramData\Anaconda3\lib\site-packages\sentence_transformers\SentenceTransformer.py", line 104, in __init__
    if config['__version__'] > __version__:
KeyError: '__version__'

nreimers commented 3 years ago

When you start from a plain transformer model, you have to create the sentence transformer model from it yourself, following the code here: https://www.sbert.net/docs/training/overview.html#creating-networks-from-scratch

parthplc commented 3 years ago

hey, @PeijiYang were you able to solve the issue?

baiziyuandyufei commented 3 years ago

@nreimers is right!

My code:

from sentence_transformers import SentenceTransformer, models

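# Name or path of the transformer model to wrap
model_path = 'xlm-r-100langs-bert-base-nli-stsb-mean-tokens'
# Wrap the transformer so that it outputs one embedding per token
word_embedding_model = models.Transformer(model_path, max_seq_length=256)
# Pool the token embeddings into a single fixed-size sentence vector
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
# Chain both modules into a SentenceTransformer
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])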
embeddings = model.encode(['Hello World', 'Hallo Welt', 'Hola mundo'])
print(embeddings)
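
For intuition about the pooling step: as far as I know, models.Pooling uses mean pooling when it is given only the embedding dimension, i.e. it averages the token vectors and skips padding positions. The following is a plain-Python sketch of that idea, not the library's implementation. The function name mean_pool and the toy numbers are made up for illustration.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

The padding row is ignored, so the sentence vector is the average of the first two token vectors only.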