Closed klimentij closed 3 years ago
Hi, the v2 models were trained differently and no longer output scores between 0 and 1. Instead, they output the raw logits which can be any value (but tend to be between -10 and 10).
So scores around 1 are rather low for the v2 models.
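To make the scale concrete, raw logits can be mapped back to 0-1 scores by applying the sigmoid manually (a minimal sketch; the example logit values are illustrative, not model outputs):

```python
import math

def sigmoid(logit: float) -> float:
    """Map a raw cross-encoder logit to a (0, 1) relevance score."""
    return 1.0 / (1.0 + math.exp(-logit))

# A logit near 0 maps to ~0.5; large-magnitude logits saturate toward 0 or 1.
print(sigmoid(0.0))              # 0.5
print(round(sigmoid(9.4), 4))    # 0.9999
print(sigmoid(-9.4) < 1e-4)      # True
```

Under this mapping, a raw v2 logit of "around 1" corresponds to a sigmoid score of only ~0.73, which is why it should be read as a weak signal.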
Right, but I'm using it with sentence-transformers like this:
from sentence_transformers import CrossEncoder
model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-12-v2', max_length=512)
As far as I can see in your code, it applies sigmoid to the output logits in predict.
I also have an ONNX version of this model without sigmoid, and it produces high values (around 8-10) on these pairs.
For these models, the activation function is the identity, if you use a recent version of sentence-transformers: https://github.com/UKPLab/sentence-transformers/blob/7451b0fca949721eacd37d0df6360096b6b0f222/sentence_transformers/cross_encoder/CrossEncoder.py#L66
Further, I can recommend using the L6 version of MiniLM; it works better.
Here is a Colab: https://colab.research.google.com/drive/1IBWQ8oCCbeF4U-lv5Gea61JvYj2TulGd?usp=sharing
For the first doc and query, it outputs a really low score of -9.4
Oh, I'm using version 1.0.2, that's why.
Thanks for suggesting L6, I'll give it a try!
Thank you for this great library and pre-trained models! Just wanted to share our observations, because you might find it helpful if you decide to train the next version of MiniLM cross-encoder.
We had been using cross-encoder/ms-marco-electra-base for some time in production and recently moved to cross-encoder/ms-marco-MiniLM-L-12-v2. Unfortunately, we started seeing unstable behavior when queries are proper names, so we had to downgrade back to Electra. After an investigation, we found a bunch of query-document pairs (with totally irrelevant documents) where MiniLM tends to predict a score close to 1, while Electra's score is close to 0 (which is correct). Here are some examples:
All of these examples get scores around 0.96-0.99 from cross-encoder/ms-marco-MiniLM-L-12-v2 and around 0.0001 from cross-encoder/ms-marco-electra-base.
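To compare the two models on one scale, the reported sigmoid scores can be converted back to logits with the log-odds (inverse sigmoid) function. A minimal sketch, using the approximate score values quoted above:

```python
import math

def logit(p: float) -> float:
    """Inverse sigmoid: map a (0, 1) score back to a raw logit."""
    return math.log(p / (1.0 - p))

# MiniLM's ~0.96-0.99 scores correspond to strongly positive logits,
# while Electra's ~0.0001 corresponds to a strongly negative one.
print(round(logit(0.99), 2))    # 4.6
print(round(logit(0.0001), 2))  # -9.21
```

Seen this way, the disagreement is not a borderline calibration difference: the two models assign logits roughly 14 units apart on the same irrelevant pairs.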