Closed moonisali closed 11 months ago
Hello!
Yes, this is possible. As you can see in this snippet, a `SetFitModel` instance contains a `model_body` as well as a `model_head`:

https://github.com/huggingface/setfit/blob/efef17e91f56fae611c221657bcd35d5123ac9fd/src/setfit/modeling.py#L188-L202

This body is always a `SentenceTransformer`, exactly like in the link that you sent. This means that you can perform the following:
```python
from sentence_transformers import util
from setfit import SetFitModel

# Load model from the Hub
model = SetFitModel.from_pretrained(...)

# Optionally train the model
# trainer = SetFitTrainer(
#     model,
#     ...,
# )
# trainer.train()

# Copied and modified from https://www.sbert.net/docs/usage/semantic_textual_similarity.html
# Two lists of sentences
sentences1 = ['The cat sits outside',
              'A man is playing guitar',
              'The new movie is awesome']

sentences2 = ['The dog plays in the garden',
              'A woman watches TV',
              'The new movie is so great']

# Compute embeddings for both lists
embeddings1 = model.model_body.encode(sentences1, convert_to_tensor=True)
embeddings2 = model.model_body.encode(sentences2, convert_to_tensor=True)

# Compute cosine similarities
cosine_scores = util.cos_sim(embeddings1, embeddings2)
```
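For intuition, `util.cos_sim` returns an N×M matrix whose entry `[i, j]` is the cosine similarity between the i-th embedding in the first list and the j-th in the second. A minimal NumPy sketch of the same computation (illustrative only, not the library's implementation):

```python
import numpy as np

def cos_sim_matrix(a, b):
    # Normalize each row to unit length, then take all pairwise dot
    # products: entry [i, j] is the cosine similarity of a[i] and b[j].
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy 2-d vectors instead of real sentence embeddings
sims = cos_sim_matrix([[1.0, 0.0], [0.0, 1.0]],
                      [[1.0, 0.0], [1.0, 1.0]])
```

Here `sims[0, 0]` is 1.0 (identical direction) and `sims[1, 0]` is 0.0 (orthogonal), just as `cosine_scores[i][j]` behaves for real embeddings.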
`SetFitModel.encode(...)` has been added for getting the embeddings from a SetFit model (or rather, from its finetuned Sentence Transformer body). It should be included in the upcoming release this week!

Closed via #439
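For the semantic-similarity use case in the question, once you have embeddings (from `model.model_body.encode` or the new `SetFitModel.encode`), ranking candidates against a query reduces to an argmax over cosine similarities. A sketch with plain NumPy, where `rank_candidates` is an illustrative helper (not part of setfit) and the toy vectors stand in for real embeddings:

```python
import numpy as np

def rank_candidates(query_emb, candidate_embs):
    """Return candidate indices sorted by cosine similarity to the query, best first."""
    q = np.asarray(query_emb, dtype=float)
    c = np.asarray(candidate_embs, dtype=float)
    q = q / np.linalg.norm(q)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    scores = c @ q               # cosine similarity of each candidate to the query
    order = np.argsort(-scores)  # highest similarity first
    return order, scores[order]

# Toy 2-d "embeddings": candidate 2 points the same way as the query.
order, scores = rank_candidates([1.0, 0.0],
                                [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])
```

With real sentences you would replace the toy vectors with `model.model_body.encode([query])[0]` and `model.model_body.encode(candidates)`, skipping the classification head entirely.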
Hey, first of all, thank you for this great package!

My task relates to semantic similarity, in which I find the 'closeness' of a query sentence to a list of candidate sentences, something like shown here. I wanted to know if there is a way to extract embeddings from a trained SetFit model and then, instead of using the classification head, compute the similarity of a given query sentence to those embeddings.

Awaiting your answer, thanks again