Open Hannibal046 opened 3 months ago
For BERT-like models such as BAAI/bge-large-en-v1.5, it works just fine.
I have observed the same issue: for individual examples the embeddings are very different, but when measured on a full dataset the results do not change much.
Hello!
That's a bit odd indeed. I have heard of situations where bfloat16 results in weird performance, but never float16. Differences of 0.0001 luckily aren't too problematic, but 0.1 would likely result in different results for downstream tasks (retrieval, classification, etc.).
I'm not sure what could be causing it.
Hi, I understand that batch encoding gives slightly different results from single-sample encoding due to the attention mask.
However, I found that for LLM-based embedders such as infgrad/stella_en_1.5B_v5 and Salesforce/SFR-Embedding-Mistral, the results are drastically different. You could verify with:
And this gives:
If we set it to float16, a common practice for MTEB, we get an even worse alignment:
So I want to know whether this is a bug. What is the recommended practice here?
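For anyone who wants to quantify the gap, here is a minimal sketch of a single-vs-batch comparison. It is not the snippet from the original post. The helper names (`cosine_similarity`, `max_abs_diff`, `compare_single_vs_batch`) are hypothetical, and the encoder is passed in as a plain callable so that the comparison logic does not depend on any particular model or library.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical encoder signature: (sentences, batch_size) -> one vector per sentence.
EncodeFn = Callable[[List[str], int], List[List[float]]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def compare_single_vs_batch(encode: EncodeFn, sentences: List[str]) -> List[Dict]:
    """Encode each sentence alone (batch_size=1) and all together, then compare."""
    single = encode(sentences, 1)
    batched = encode(sentences, len(sentences))
    return [
        {
            "sentence": s,
            "cosine": cosine_similarity(u, v),
            "max_abs_diff": max_abs_diff(u, v),
        }
        for s, u, v in zip(sentences, single, batched)
    ]


# Possible usage with sentence-transformers (assumes the package is installed
# and the model can be downloaded):
#
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("BAAI/bge-large-en-v1.5")
#   encode = lambda s, bs: model.encode(s, batch_size=bs).tolist()
#   for row in compare_single_vs_batch(encode, ["short", "a much longer sentence"]):
#       print(row)
```

Using sentences of different lengths matters here, since padding only appears when a batch mixes lengths. To repeat the check in half precision, one option is to call `model.half()` before encoding, assuming the model runs on a GPU that supports float16.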