jbellis closed this 1 month ago
Hi @jbellis, sorry for the late answer! You are right, there was an issue when running the eval script with the latest version of vidore-benchmark. I have merged a fix that solves the issue for your command:
vidore-benchmark evaluate-retriever --model-name bm25 --dataset-name vidore/docvqa_test_subsampled_tesseract --split test
If you want to reproduce our paper results exactly, I would suggest using v1.0.0 of this package 👋🏼
$ vidore-benchmark evaluate-retriever --model-name bm25 --dataset-name vidore/docvqa_test_subsampled_tesseract --split test
The only caller of get_scores_bm25 is a unit test. It's not clear how to wire this up to main.py / evaluate.py. Should I be using a different branch?