dhandhalyabhavik opened this issue 1 month ago
It seems you did not experiment with swapping x, y to y, x. You could compare the two orderings directly:
print(get_score(a, b))
print(get_score(b, a))
I did; see these lines:
print(get_score(a, b))  # case 1
0.7585506439208984
print(get_score(b, a))  # case 2
0.17746654152870178
It's amazing that there is such a phenomenon!
Is it because the LLM replies differently when the ranking/ordering of content changes in a RAG application?
I don't know much about this part. In theory, the score should be computed as the distance between the two vectors, so swapping their positions should not affect it.
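For what it's worth, a plain embedding-distance score is symmetric by construction. A minimal sketch with NumPy (the vectors below are made-up numbers standing in for real sentence embeddings, not anything produced by this library):

```python
import numpy as np

# Hypothetical embedding vectors standing in for the two sentences.
vec_a = np.array([0.2, 0.7, 0.1])
vec_b = np.array([0.5, 0.3, 0.4])

def cosine_similarity(x, y):
    # cos(x, y) == cos(y, x): a distance-based score cannot change when the
    # arguments are swapped.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

print(cosine_similarity(vec_a, vec_b))  # same value either way
print(cosine_similarity(vec_b, vec_a))
```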
We trained a cross-encoder model to evaluate similarity; conceptually the pair should be position-invariant. However, since it is built on BERT, a lightweight transformer with no constraint enforcing symmetry, some unusual behavior is possible.
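To illustrate why a cross-encoder can be order-sensitive: both texts are fed through the transformer as one concatenated sequence (roughly [CLS] a [SEP] b [SEP]), so (a, b) and (b, a) are literally different inputs. Below is a minimal sketch using sentence-transformers' CrossEncoder as a stand-in for the default ONNX model here; the checkpoint name and the texts are only examples, not the model this project ships:

```python
from sentence_transformers import CrossEncoder

# Example public cross-encoder checkpoint; the actual ONNX model behind
# get_score may differ.
model = CrossEncoder("cross-encoder/stsb-roberta-base")

a = "How do I reset my password?"             # hypothetical query
b = "Steps to change your account password."  # hypothetical cached answer

# Each pair is encoded as a single sequence, so swapping the order changes
# the input tokens and can change the score.
score_ab, score_ba = model.predict([(a, b), (b, a)])
print(score_ab, score_ba)

# A common workaround is to symmetrize the score by averaging both orders.
symmetric_score = (score_ab + score_ba) / 2
print(symmetric_score)
```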
Current Behavior
Using the default ONNX model's score function:
Case 1: get_score(a, b)
Case 2: get_score(b, a)
I just changed x, y to y, x when passing the arguments to get_score. Why is there such a drastic change in the scores?