YJiangcm / PromCSE

Code for "Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning (EMNLP 2022)"
https://arxiv.org/abs/2203.06875v2

The huggingface checkpoints do not pass the sanity check #11

Closed Serbernari closed 1 year ago

Serbernari commented 1 year ago

Hi! I was impressed by the results described in your paper, however when I tried to use your model I got very strange results:

```python
from sentence_transformers import SentenceTransformer, util

sentences = ["How is data encrypted in transit?", "Does your application use a firewall?"]
model = SentenceTransformer("YuxinJiang/sup-promcse-roberta-base")  # also tried the large variant
embeddings = model.encode(sentences)

# Compute cosine-similarities
cosine_scores = util.cos_sim(embeddings[0], embeddings[1])
cosine_scores
# tensor([[0.9787]])
```

The same code with `sentence-transformers/all-MiniLM-L12-v1` gives a far more realistic `tensor([[0.1294]])`.

Could you help me figure out what went wrong here?

YJiangcm commented 1 year ago

Hi, our method freezes all transformer parameters and only tunes the additional soft prompt. The checkpoint "YuxinJiang/sup-promcse-roberta-base" contains the parameters of the frozen roberta-base as well as the parameters of the soft prompts. Directly loading our model checkpoint with sentence_transformers will only load the parameters of roberta-base. You can verify this by running

```python
from sentence_transformers import SentenceTransformer, util

sentences = ["How is data encrypted in transit?", "Does your application use a firewall?"]
model = SentenceTransformer("roberta-base")
embeddings = model.encode(sentences)

# Compute cosine-similarities
cosine_scores = util.cos_sim(embeddings[0], embeddings[1])
cosine_scores
```

and the cosine_scores would also be `tensor([[0.9787]])`.
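
As a further sanity check, you can compare the two models' embeddings directly; this is a minimal sketch, assuming both loads fall back to the same default mean-pooling head (the tolerance value is an arbitrary choice):

```python
# Sketch: if sentence_transformers only restores the roberta-base weights
# from the PromCSE checkpoint, both models should yield (near-)identical
# embeddings for the same input.
import numpy as np
from sentence_transformers import SentenceTransformer

sentence = ["How is data encrypted in transit?"]
emb_promcse = SentenceTransformer("YuxinJiang/sup-promcse-roberta-base").encode(sentence)
emb_roberta = SentenceTransformer("roberta-base").encode(sentence)
print(np.allclose(emb_promcse, emb_roberta, atol=1e-5))  # expected to print True
```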

Though our models do not fit sentence_transformers, we have released an easy-to-use Python package, promcse (https://pypi.org/project/promcse/), which provides functions to:

(1) encode sentences into embedding vectors;
(2) compute cosine similarities between sentences;
(3) given queries, retrieve the top-k semantically similar sentences for each query.
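
For example, here is a minimal sketch of those three functions; the constructor arguments (the pooler name "cls_before_pooler" and the soft-prompt length 16) follow the package README and may differ across versions:

```python
from promcse import PromCSE

# Load the checkpoint together with its soft-prompt parameters
# (pooler and prompt length per the README; adjust if your version differs).
model = PromCSE("YuxinJiang/sup-promcse-roberta-base", "cls_before_pooler", 16)

# (1) encode sentences into embedding vectors
embeddings = model.encode(["How is data encrypted in transit?",
                           "Does your application use a firewall?"])

# (2) compute cosine similarities between sentences
similarity = model.similarity("How is data encrypted in transit?",
                              "Does your application use a firewall?")

# (3) build an index, then retrieve the top-k similar sentences for a query
model.build_index(["A man is playing music.", "A woman is reading a book."])
results = model.search("Someone plays the guitar.")
```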

You can also get a quick start with the Open In Colab notebook.

Thanks a lot for your question! I sincerely hope our work can benefit more people :)