PrithivirajDamodaran / FlashRank

Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & Collaborations.
Apache License 2.0

Support for Custom Models like ce-esci-MiniLM-L12-v2 in FlashRank #5

Closed by ulan-yisaev 5 months ago

ulan-yisaev commented 5 months ago

I am currently integrating a reranking solution into my Haystack RAG pipeline. I have been using the MetaRank reranker, but I am exploring FlashRank as an alternative because it is more efficient to operate and, unlike MetaRank, does not require a dedicated container.

In my evaluation on a specific dataset (details below), the MetaRank models, particularly ce-esci-MiniLM-L12-v2, performed better than the FlashRank model I tried.

I would like to know whether FlashRank supports custom models such as ce-esci-MiniLM-L12-v2. Being able to use such models would make FlashRank more effective in use cases where a particular model is known to perform better.

Looking forward to your guidance on this.

Thank you!

Dataset used and results:

from haystack import Document
documents = [
    Document(
        "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena ."),
    Document("A wild animal races across an uncut field with a minimal amount of trees ."),
    Document(
        "People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco ."),
    Document("A man who is riding a wild horse in the rodeo is very near to falling off ."),
    Document("A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse ."),
]
...
query = "wild west"
flashranker = FlashRankReranker(cache_dir="/tmp", model_name="rank-T5-flan")
...

MetaRankReranker:
score: 0.008271879516541958, content: A wild animal races across an uncut field with a minimal amount of trees .
score: 0.0013235947117209435, content: A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .
score: 0.0004636832163669169, content: A man who is riding a wild horse in the rodeo is very near to falling off .
score: 0.00036893945070914924, content: A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .
score: 2.5501691197860055e-05, content: People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco .
Execution Time: 0.29767465591430664 seconds

FlashRankReranker:
score: 0.5397251844406128, content: People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco .
score: 0.5245905518531799, content: A man who is riding a wild horse in the rodeo is very near to falling off .
score: 0.51319420337677, content: A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .
score: 0.47907745838165283, content: A wild animal races across an uncut field with a minimal amount of trees .
score: 0.4261687099933624, content: A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .
Execution Time: 0.0816802978515625 seconds

ulan-yisaev commented 5 months ago

I wanted to provide an update on my efforts to integrate the metarank/ce-msmarco-MiniLM-L6-v2 model with FlashRank. I was able to manually download this model and place it in the cache_dir used by FlashRank.

After running FlashRank with this model, I observed a significant improvement in the reranking results on my dataset. Here's a brief overview of the improved results:

FlashRankReranker:
score: 0.008271878585219383, content: A wild animal races across an uncut field with a minimal amount of trees .
score: 0.0013235947117209435, content: A rodeo cowboy , wearing a cowboy hat , is being thrown off of a wild white horse .
score: 0.00046368365292437375, content: A man who is riding a wild horse in the rodeo is very near to falling off .
score: 0.00036893924698233604, content: A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .
score: 2.5501691197860055e-05, content: People line the stands which advertise Freemont 's orthopedics , a cowboy rides a light brown bucking bronco .
Execution Time: 0.012963533401489258 seconds
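
To put a number on how closely two rerankers agree, instead of comparing the lists by eye, one option is Kendall's tau over the two orderings. The sketch below is only an illustration: it is not part of FlashRank or MetaRank, the `kendall_tau` helper and the short document labels are hypothetical names introduced for this example, and the orderings are copied from the outputs posted in this thread.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Kendall rank correlation between two orderings of the same items (no ties)."""
    if len(ranking_a) < 2:
        raise ValueError("need at least two items to compare")
    if len(set(ranking_a)) != len(ranking_a) or len(set(ranking_b)) != len(ranking_b):
        raise ValueError("rankings must not contain duplicates")
    if set(ranking_a) != set(ranking_b):
        raise ValueError("both rankings must contain the same items")

    # Position of every item in the second ranking.
    pos_b = {item: i for i, item in enumerate(ranking_b)}

    concordant = discordant = 0
    # combinations() yields pairs (x, y) where x precedes y in ranking_a.
    for x, y in combinations(ranking_a, 2):
        if pos_b[x] < pos_b[y]:
            concordant += 1  # same relative order in both rankings
        else:
            discordant += 1  # relative order is flipped
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical short labels for the five documents, in the order each
# reranker returned them (highest score first), taken from this thread.
metarank = ["wild_animal", "thrown_cowboy", "man_on_horse", "basketball", "bronco"]
flashrank_t5 = ["bronco", "man_on_horse", "basketball", "wild_animal", "thrown_cowboy"]
flashrank_custom = ["wild_animal", "thrown_cowboy", "man_on_horse", "basketball", "bronco"]

print(kendall_tau(metarank, flashrank_t5))      # -0.6
print(kendall_tau(metarank, flashrank_custom))  # 1.0
```

A value of 1.0 means identical order and -1.0 means fully reversed order. On these five documents, FlashRank with the manually placed model reproduces MetaRank's order exactly, while rank-T5-flan mostly disagrees with it.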

It would be highly beneficial if FlashRank could download models directly from the Hugging Face model hub instead of relying on hardcoded model paths. That would make it much easier to experiment with different models and find the one that best fits a specific need, and it could expand the use cases for FlashRank.

PrithivirajDamodaran commented 5 months ago

Thanks for reaching out. Models fine-tuned on the Amazon ESCI dataset open interesting avenues. Will add this model to the roadmap: metarank/ce-esci-MiniLM-L12-v2

ulan-yisaev commented 5 months ago

Thank you very much for your response and for considering the integration of the ce-esci-MiniLM-L12-v2 model into FlashRank's roadmap! I would also like to suggest the potential integration of a multilingual model to cater to use cases involving European languages, such as German. A model like metarank/multilingual-e5-small from Hugging Face could be a valuable asset for users dealing with multilingual contexts. I believe its inclusion could significantly enhance FlashRank's applicability in diverse linguistic environments.

PrithivirajDamodaran commented 5 months ago

Please open a separate request for each model; that is the easiest way to track them. Just the model name will suffice, so keep it concise. If we see fit, we will add the model name to the README under the model roadmap.

ulan-yisaev commented 5 months ago

I wanted to provide a quick update regarding the integration of metarank/multilingual-e5-small into FlashRank. After further investigation, I've realized that this model is a bi-encoder, not a cross-encoder. Therefore, it wouldn't be appropriate to integrate it into FlashRank.