Open chuan298 opened 10 months ago
Hi, thanks for your interest in our work!
- The standard cosine similarity ranges from -1 to 1. A higher temperature coefficient will make the model's score range closer to the standard range. In our experience, the similarity range for bge-zh-v1.5 is between 0.2 and 1, while for bge-en-v1.5, it is between 0.4 and 1.
- Yes. If you used an instruction during fine-tuning, you should use the same instruction at inference.
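To illustrate the temperature point, here is a minimal sketch of how a temperature rescales cosine similarities before a softmax, as in a typical contrastive loss. This is not the repo's training code: the function name `softmax_with_temperature`, the similarity values, and the temperatures 0.01 and 1.0 are all assumptions chosen for illustration.

```python
import math


def softmax_with_temperature(similarities, temperature):
    """Softmax over cosine similarities after dividing by a temperature.

    Hypothetical helper for illustration only.
    """
    logits = [s / temperature for s in similarities]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up cosine similarities: the first is the positive, the rest are negatives.
sims = [0.8, 0.6, 0.5]

# Low temperature: a gap of 0.2 becomes a gap of 20 in the logits,
# so nearly all probability mass lands on the positive.
low_t = softmax_with_temperature(sims, temperature=0.01)

# High temperature: the same gap stays 0.2 in the logits, so the
# probabilities remain close together.
high_t = softmax_with_temperature(sims, temperature=1.0)

print("temperature=0.01:", low_t)
print("temperature=1.0: ", high_t)
```

Under these assumptions, a low temperature lets the loss separate positives from negatives while all scores stay inside a narrow band, which is one plausible reading of why the observed similarities are compressed. With a higher temperature, the same separation would need larger gaps between the raw cosine similarities.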
Hi,
"The standard cosine similarity ranges from -1 to 1. A higher temperature coefficient will make the model's score range closer to the standard range."
When I use bge-en-v1.5, I find that the cosine similarity score typically ranges between 0.4 and 1. (So two random documents will have a similarity of about 0.4, which is kind of counter-intuitive.)
So why use temperature to "reshape" the similarity distribution? Why not keep the range from -1 to 1, which seems more intuitive and clearer for revealing the negative/positive relationship between documents?
Hi guys, thanks for your great repo. I want to ask a few questions.