FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License
5.92k stars, 427 forks

Evaluate an LLM reranker after finetuning #896

Open majdabd opened 2 weeks ago

majdabd commented 2 weeks ago

Hello, is there a way to evaluate an LLM reranker after I finetune it on my own training dataset? Also, how should the test set be structured? The same as the training data (e.g. toy_finetune_data.jsonl)? Thank you.

staoxiao commented 2 weeks ago

Hi @majdabd, there is currently no simple script for evaluating the reranker model. You can use the reranker to compute scores for the query-passage pairs, sort the passages by score, and then compute the metrics. For the metric computation, you can refer to the evaluation script of the embedding model: https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/finetune/eval_msmarco.py#L250-L260 . We plan to add an evaluation script in the future.
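
The score, sort, and compute-metric steps could look roughly like the sketch below. It is not an official script. It assumes the test file uses the same layout as the training data, with one JSON object per line holding a `query` string, a `pos` list, and a `neg` list. `load_jsonl`, `evaluate_reranker`, and `score_fn` are hypothetical names introduced for illustration; `score_fn` stands in for whatever scores a batch of (query, passage) pairs, for example a thin wrapper around the reranker's `compute_score` method, assuming it returns one score per pair.

```python
import json
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

# Hypothetical type: takes (query, passage) pairs, returns one score per pair.
ScoreFn = Callable[[List[Tuple[str, str]]], Sequence[float]]


def load_jsonl(path: str) -> List[dict]:
    """Read one JSON object per line (assumed keys: query, pos, neg)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def evaluate_reranker(
    examples: Iterable[dict],
    score_fn: ScoreFn,
    ks: Sequence[int] = (1, 10),
) -> Dict[str, float]:
    """Score each query's candidates, sort them, and average MRR@k / Recall@k."""
    mrr_sum = {k: 0.0 for k in ks}
    recall_sum = {k: 0.0 for k in ks}
    n_queries = 0

    for ex in examples:
        positives, negatives = list(ex["pos"]), list(ex["neg"])
        if not positives:
            continue  # a query without a relevant passage cannot be scored
        candidates = positives + negatives
        labels = [1] * len(positives) + [0] * len(negatives)

        # Step 1: compute a relevance score for every (query, passage) pair.
        scores = score_fn([(ex["query"], passage) for passage in candidates])

        # Step 2: sort candidate indices by score, highest first.
        order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        ranked_labels = [labels[i] for i in order]

        # Step 3: accumulate the metrics at each cutoff k.
        for k in ks:
            top_k = ranked_labels[:k]
            first_hit = next((r for r, lab in enumerate(top_k, start=1) if lab), None)
            mrr_sum[k] += 1.0 / first_hit if first_hit else 0.0
            recall_sum[k] += sum(top_k) / len(positives)
        n_queries += 1

    if n_queries == 0:
        raise ValueError("no examples with at least one positive passage")

    metrics: Dict[str, float] = {}
    for k in ks:
        metrics[f"MRR@{k}"] = mrr_sum[k] / n_queries
        metrics[f"Recall@{k}"] = recall_sum[k] / n_queries
    return metrics
```

Note that this only reranks the passages listed for each query, so the numbers depend on how hard the `neg` passages are. For results comparable to a retrieval benchmark, the candidates would need to come from a first-stage retriever instead.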