UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

How to use model in HuggingFace Transformers #1638

Open Ponyo1 opened 2 years ago

Ponyo1 commented 2 years ago

Hi,

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('sentence-transformers/distiluse-base-multilingual-cased-v2')
embeddings = model.encode(sentences)
print(embeddings)
```

This is the code in Sentence-Transformers.

But how should I use the model in HuggingFace Transformers? For example:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Use the full hub id and the Auto classes, which resolve the correct
# architecture for this checkpoint
tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/distiluse-base-multilingual-cased-v2')
model = AutoModel.from_pretrained('sentence-transformers/distiluse-base-multilingual-cased-v2')

sentences = ["This is an example sentence", "Each sentence is converted"]

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
```

And then?

How can I guarantee that the two computations produce the same embeddings?
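(Once both pipelines produce an embedding array, comparing them is just a numerical check. A minimal sketch, using toy stand-in arrays rather than real model output; `emb_st` and `emb_hf` are hypothetical names for the two results:)

```python
import numpy as np

# Hypothetical: emb_st from SentenceTransformer.encode, emb_hf from the
# Transformers pipeline after pooling. Toy values stand in here.
emb_st = np.array([[0.1, 0.2, 0.3]])
emb_hf = np.array([[0.1, 0.2, 0.3]])

# Elementwise comparison needs a tolerance, since float math rarely matches bit-for-bit
print(np.allclose(emb_st, emb_hf, atol=1e-5))  # True

# Cosine similarity is often the more meaningful check for embeddings
cos = (emb_st @ emb_hf.T) / (np.linalg.norm(emb_st) * np.linalg.norm(emb_hf))
print(cos[0, 0])  # close to 1.0 when the two pipelines agree
```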

imamnurby commented 2 years ago

@Ponyo1 I think they show an example of how to use the model with HF Transformers at this link: https://huggingface.co/sentence-transformers/all-mpnet-base-v2#usage-huggingface-transformers. You may need to adapt the code for your use case.

I haven't tried it myself, though, so I can't confirm the results match.
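For reference, the step that model card adds on top of the raw Transformers output is mean pooling: averaging the token embeddings while masking out padding positions. A minimal sketch of just that operation, with toy tensors standing in for real model output (the `mean_pooling` name mirrors the model card; the numbers are made up):

```python
import numpy as np

def mean_pooling(token_embeddings, attention_mask):
    """Average token embeddings over the sequence, ignoring padding.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), a_min=1e-9, a_max=None)       # avoid divide-by-zero
    return summed / counts

# Toy example: batch of 1, seq_len 3 (last position is padding), dim 2
emb = np.array([[[1.0, 2.0], [3.0, 4.0], [99.0, 99.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pooling(emb, mask))  # padded row excluded: [[2. 3.]]
```

One caveat: if I remember correctly, `distiluse-base-multilingual-cased-v2` also applies a Dense layer (768 -> 512) after pooling in its SentenceTransformer configuration, so mean pooling over raw `AutoModel` outputs alone may not exactly reproduce `model.encode()` for that particular checkpoint.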