Evaluate an LLM reranker after finetuning

FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

MIT License

Evaluate an LLM reranker after finetuning #896

Open majdabd opened 5 months ago

majdabd commented 5 months ago

Hello, is there a way to evaluate an LLM reranker after I finetune it on my own training dataset? Also, how should the test set be structured? The same way as the training data (e.g., toy_finetune_data.jsonl)? Thank you.

staoxiao commented 5 months ago

Hi @majdabd, there is currently no simple script for evaluating the reranker model. You can use the reranker to compute scores for the query-passage pairs, sort the candidates for each query by score, and then compute the metrics. For the metric computation, you can refer to the evaluation script of the embedding model: https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/baai_general_embedding/finetune/eval_msmarco.py#L250-L260 . We plan to add an evaluation script in the future.
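The score, sort, and compute-metric steps described above could be sketched roughly as follows. This is only a minimal illustration, not an official evaluation script. It assumes the test set reuses the training layout (one JSON object per line with `query`, `pos`, and `neg` fields), and it reports MRR@k and Recall@k as example metrics. The names `evaluate_reranker`, `load_jsonl`, `toy_overlap_score`, and `toy_examples` are hypothetical, and the word-overlap scorer is only a stand-in so the sketch runs without downloading a model.

```python
import json
from typing import Callable, Dict, List, Sequence

# A scorer takes [[query, passage], ...] and returns one score per pair.
ScoreFn = Callable[[List[List[str]]], Sequence[float]]


def load_jsonl(path: str) -> List[dict]:
    """Read one JSON object per line (assumed fields: query, pos, neg)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def evaluate_reranker(examples: List[dict], score_fn: ScoreFn, k: int = 10) -> Dict[str, float]:
    """Score every (query, passage) pair, sort per query, then average MRR@k and Recall@k."""
    mrr_total, recall_total, n = 0.0, 0.0, 0
    for ex in examples:
        if not ex["pos"]:
            continue  # skip queries without any relevant passage
        candidates = ex["pos"] + ex["neg"]
        labels = [1] * len(ex["pos"]) + [0] * len(ex["neg"])
        scores = score_fn([[ex["query"], p] for p in candidates])
        # Sort candidate indices by descending score and keep the top k labels.
        order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        top_k = [labels[i] for i in order][:k]
        # Reciprocal rank of the first relevant passage within the top k (0 if none).
        rr = 0.0
        for rank, rel in enumerate(top_k, start=1):
            if rel:
                rr = 1.0 / rank
                break
        mrr_total += rr
        recall_total += sum(top_k) / len(ex["pos"])
        n += 1
    if n == 0:
        raise ValueError("no examples with positive passages")
    return {f"MRR@{k}": mrr_total / n, f"Recall@{k}": recall_total / n}


def toy_overlap_score(pairs: List[List[str]]) -> List[float]:
    """Hypothetical stand-in scorer: number of words shared by query and passage."""
    return [float(len(set(q.lower().split()) & set(p.lower().split()))) for q, p in pairs]


# Hypothetical toy data in the assumed query/pos/neg layout.
toy_examples = [
    {"query": "what is a panda",
     "pos": ["the panda is a bear"],
     "neg": ["paris is in france", "cats sleep a lot"]},
    {"query": "capital of france",
     "pos": ["paris is the capital"],
     "neg": ["capital of france and of spain", "dogs bark"]},
]

if __name__ == "__main__":
    print(evaluate_reranker(toy_examples, toy_overlap_score, k=10))
```

To evaluate a finetuned model, the stand-in scorer would be swapped for the reranker's own scoring call, for example `FlagLLMReranker("path/to/your/finetuned/model").compute_score` (the path is a placeholder), assuming it accepts a list of `[query, passage]` pairs and returns one score per pair. Note that ranking only the `pos` and `neg` passages of each query measures reranking over that small candidate pool, so the numbers are not comparable to full-corpus retrieval results.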