PrithivirajDamodaran / FlashRank

Lite & super-fast re-ranking for your search & retrieval pipelines. Supports SoTA listwise and pairwise reranking based on LLMs, cross-encoders, and more. Created by Prithivi Da; open for PRs & collaborations.
Apache License 2.0

Reranking with ms-marco-MultiBERT-L-12 returns complex score #18

Closed · M1ha-Shvn closed this issue 1 month ago

M1ha-Shvn commented 1 month ago

Hi. flashrank version: 0.2.4. I'm trying to implement a reranker using the ms-marco-MultiBERT-L-12 model (my language is not English). I do the following:

from flashrank import Ranker, RerankRequest

passages = [
    {"id": "1", "text": "Our library is closed at 3pm.", "meta": {}},
    {"id": "2", "text": "You can buy books cheaply in our library book store.", "meta": {}},
    {"id": "3", "text": "The library address is Washington street, 7.", "meta": {}},
]
query = "How do I get to the library?"
request: RerankRequest = RerankRequest(query=query, passages=passages)
ranker = Ranker(model_name="ms-marco-MultiBERT-L-12", cache_dir="/app/src/flashrank_cache", max_length=1024)
result = ranker.rerank(request)

What I get is very unexpected:

  1. The score field is not a float but a list of floats. What does it mean?
  2. The most relevant result is in third place:
    [
    {'id': '2', 'text': 'You can buy books cheaply in our library book store.', 'meta': {}, 'score': [0.9290498495101929, 0.10876275599002838]}, 
    {'id': '1', 'text': 'Our library is closed at 3pm.', 'meta': {}, 'score': [0.8656406998634338, 0.17762842774391174]}, 
    {'id': '3', 'text': 'The library address is Washington street, 7.', 'meta': {}, 'score': [0.06054011359810829, 0.9271655082702637]}
    ]

    It looks like this strange score should be sorted by its second number; in that case the result would be relevant. Can you give me a clue what I'm doing wrong?
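
For anyone stuck on 0.2.4 before the fix below, a minimal client-side workaround sketch, assuming the two-element score is a per-class output whose second element tracks relevance (an assumption based on the output above, not documented behaviour):

from flashrank import Ranker, RerankRequest

passages = [
    {"id": "1", "text": "Our library is closed at 3pm.", "meta": {}},
    {"id": "2", "text": "You can buy books cheaply in our library book store.", "meta": {}},
    {"id": "3", "text": "The library address is Washington street, 7.", "meta": {}},
]
query = "How do I get to the library?"

ranker = Ranker(model_name="ms-marco-MultiBERT-L-12", cache_dir="/app/src/flashrank_cache", max_length=1024)
results = ranker.rerank(RerankRequest(query=query, passages=passages))

# Re-sort by the second score component when the score comes back as a list (0.2.4 behaviour);
# fall back to the raw score once it is a plain float.
results = sorted(
    results,
    key=lambda r: r["score"][1] if isinstance(r["score"], (list, tuple)) else r["score"],
    reverse=True,
)
for r in results:
    print(r["id"], r["score"])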

PrithivirajDamodaran commented 1 month ago
  1.) The score tensors having 2 dims was an issue; it is fixed now, and you should get passages sorted by the score.
  2.) The models will rank fine if the passages don't have grammatical errors or typos AND if the query is sensible. See below:
from flashrank import Ranker, RerankRequest
ranker = Ranker("ms-marco-MultiBERT-L-12", log_level="DEBUG")

passages = [
    {"id": "1", "text": "Our library is closed at 3pm.", "meta": {}},
    {"id": "2", "text": "You can buy books cheaply in our library book store.", "meta": {}},
    {"id": "3", "text": "The library address is in Washington street, 7.", "meta": {}},
]
query = "Where is the library?"

request = RerankRequest(query=query, passages=passages)
results = ranker.rerank(request)

This prints:

{'id': '3', 'text': 'The library address is in Washington street, 7.', 'meta': {}, 'score': 0.9984252}
{'id': '1', 'text': 'Our library is closed at 3pm.', 'meta': {}, 'score': 0.0036173377}
{'id': '2', 'text': 'You can buy books cheaply in our library book store.', 'meta': {}, 'score': 0.0020632916}

Upgrade to 0.2.5
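
For completeness, a small check sketch after upgrading (e.g. pip install -U flashrank==0.2.5), assuming the fixed release returns a single scalar score per passage as shown above:

from flashrank import Ranker, RerankRequest

ranker = Ranker("ms-marco-MultiBERT-L-12")
request = RerankRequest(
    query="Where is the library?",
    passages=[{"id": "3", "text": "The library address is in Washington street, 7.", "meta": {}}],
)
results = ranker.rerank(request)

score = results[0]["score"]
print(type(score), score)                    # expect a scalar, e.g. 0.99..., not a two-element list
assert not isinstance(score, (list, tuple))  # 0.2.4 returned a list here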