FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

About temperature, query_instruction_for_retrieval and passage_instruction_for_retrieval? #402

Open chuan298 opened 10 months ago

chuan298 commented 10 months ago

Hi guys, thanks for your great repo. I want to ask some questions:

  1. What is the similarity distribution of the model when I set temperature = 0.02? Previously, I saw you say that with temperature = 0.01, the similarity distribution is [0.6, 1].
  2. I fine-tuned the model with query_instruction_for_retrieval="query: " and passage_instruction_for_retrieval="passage: ", so do I also need to add them at inference time?
staoxiao commented 10 months ago

Hi, thanks for your interest in our work!

  1. The standard cosine similarity ranges from -1 to 1. A higher temperature coefficient will make the model's score range closer to the standard range. In our experience, the similarity range for bge-zh-v1.5 is between 0.2 and 1, while for bge-en-v1.5, it is between 0.4 and 1.
  2. Yes, you should use the same instructions at inference if you used them during fine-tuning.
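
For point 2, here is a minimal inference sketch. The checkpoint path and texts are placeholders; `FlagModel.encode_queries` prepends `query_instruction_for_retrieval` automatically, and since some `FlagModel` versions expose no passage-side instruction argument, the passage instruction is prepended by hand here to reproduce the fine-tuning input format:

```python
from FlagEmbedding import FlagModel

# Hypothetical path to the checkpoint fine-tuned with "query: " / "passage: ".
model = FlagModel(
    "path/to/your-finetuned-bge",
    query_instruction_for_retrieval="query: ",  # must match fine-tuning
    use_fp16=True,
)

queries = ["what is the capital of france"]
passages = ["Paris is the capital and largest city of France."]

# encode_queries prepends query_instruction_for_retrieval automatically.
q_emb = model.encode_queries(queries)

# Assumption: if your FlagModel version has no passage-side instruction
# argument, prepending "passage: " by hand is equivalent to what fine-tuning saw.
p_emb = model.encode(["passage: " + p for p in passages])

scores = q_emb @ p_emb.T  # inner product = cosine (embeddings are normalized)
print(scores)
```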
dayuyang1999 commented 3 months ago

Hi,

"The standard cosine similarity ranges from -1 to 1. A higher temperature coefficient will make the model's score range closer to the standard range."

When I use bge-en-v1.5, I found the cosine similarity scores typically range between 0.4 and 1. (So two random documents will have a similarity of about 0.4, which is kind of counter-intuitive.)

So why use temperature to "reshape" the similarity distribution? Why not keep it ranging from -1 to 1, which seems more intuitive and clearer for revealing the negative/positive relationship between documents?
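
For context on the mechanism being asked about: the temperature acts during training, not inference. A minimal sketch (not the repo's actual training code) of how the temperature coefficient scales cosine similarities inside the InfoNCE softmax, and why a small value compresses the learned similarity range:

```python
import numpy as np

def info_nce_probs(sims, temperature):
    """Softmax over temperature-scaled cosine similarities (InfoNCE logits)."""
    logits = np.asarray(sims) / temperature
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

# Cosine similarities of one query against [positive, negative, negative].
sims = [0.8, 0.6, 0.5]

for t in (1.0, 0.05, 0.01):
    print(f"temperature={t}: {info_nce_probs(sims, t).round(4)}")
```

With temperature = 0.01, a cosine gap of only 0.2 already drives the positive's probability to nearly 1, so the loss exerts no pressure to push negatives all the way toward -1; the trained scores therefore settle into a narrow high band rather than spreading over the full [-1, 1] range.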