illuin-tech / vidore-benchmark

Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.
https://huggingface.co/vidore
MIT License

How to test the BM25 retriever? #46

Closed: jbellis closed this issue 1 month ago

jbellis commented 1 month ago

$ vidore-benchmark evaluate-retriever --model-name bm25 --dataset-name vidore/docvqa_test_subsampled_tesseract --split test

Traceback (most recent call last):
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/typer/main.py", line 326, in __call__
    raise e
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/typer/main.py", line 309, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/typer/core.py", line 723, in main
    return _main(
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/typer/core.py", line 193, in _main
    rv = self.invoke(ctx)
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/jonathan/miniforge3/envs/vidore/lib/python3.10/site-packages/typer/main.py", line 692, in wrapper
    return callback(**use_params)
  File "/home/jonathan/Projects/vidore-benchmark-main/src/vidore_benchmark/main.py", line 88, in evaluate_retriever
    dataset_name: evaluate_dataset(
  File "/home/jonathan/Projects/vidore-benchmark-main/src/vidore_benchmark/evaluation/evaluate.py", line 51, in evaluate_dataset
    emb_queries = vision_retriever.forward_queries(queries, batch_size=batch_query)
  File "/home/jonathan/Projects/vidore-benchmark-main/src/vidore_benchmark/retrievers/bm25_retriever.py", line 26, in forward_queries
    raise NotImplementedError("BM25Retriever only need get_scores_bm25 method.")
NotImplementedError: BM25Retriever only need get_scores_bm25 method.

The only caller of get_scores_bm25 is a unit test. It's not clear how to wire this up to main.py / evaluate.py. Should I be using a different branch?
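For context on what get_scores_bm25 is expected to compute, here is a minimal, self-contained sketch of Okapi BM25 scoring in pure Python. The bm25_scores helper, corpus, and query below are illustrative stand-ins, not the package's actual API or data:

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per tokenized document for the query."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores

# Illustrative OCR'd page texts (stand-ins for the dataset's passages).
corpus = [
    "invoice total amount due thirty dollars",
    "quarterly revenue report for fiscal year",
    "shipping address and delivery instructions",
]
docs = [doc.split() for doc in corpus]  # naive whitespace tokenization
scores = bm25_scores("total amount due".split(), docs)
best = max(range(len(scores)), key=scores.__getitem__)
```

In the benchmark, an evaluation loop would compute such scores for every query against every page's OCR text and rank pages per query, rather than embedding queries and documents separately as forward_queries implies.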

tonywu71 commented 1 month ago

Hi @jbellis, sorry for the late reply! You are right, there was an issue when running the eval script with the latest version of vidore-benchmark. I have merged a fix that resolves the issue for your command:

vidore-benchmark evaluate-retriever --model-name bm25 --dataset-name vidore/docvqa_test_subsampled_tesseract --split test

If you're looking to accurately reproduce our paper results, I would suggest using v1.0.0 of this package 👋🏼