Processing CQADupstack does not work out-of-the-box

I'm running evaluate_anserini_docT5query.py. The script calls util.download_and_unzip and then GenericDataLoader, which fails with the following error when the dataset is cqadupstack:

ValueError: File /home/josh/source/beir/examples/retrieval/evaluation/sparse/datasets/cqadupstack/corpus.jsonl not present! Please provide accurate file.

The CQADupstack dataset is split into sub-categories, each in its own sub-directory, so there is no corpus.jsonl at the top level of the dataset folder. That is what triggers the error above.

 ls -las ~/source/beir/examples/retrieval/evaluation/sparse/datasets/cqadupstack/
total 56
4 drwxr-xr-x 14 josh josh 4096 Jul  1 08:38 .
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:35 ..
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:38 android
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 english
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:35 gaming
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 gis
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 mathematica
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 physics
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 programmers
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 stats
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:36 tex
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 unix
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 webmasters
4 drwxr-xr-x  3 josh josh 4096 Jul  1 08:37 wordpress
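
One workaround I'm considering is to point GenericDataLoader at each category folder in turn instead of at the top-level directory. Below is a minimal sketch of that idea, not a confirmed fix. It assumes every sub-directory holds its own corpus.jsonl, queries.jsonl and qrels folder, and the helper names find_category_dirs and load_per_category are my own, not part of beir. I don't know whether per-category loading matches how the paper's numbers were produced.

```python
from pathlib import Path


def find_category_dirs(root):
    """Return the sorted sub-directories of `root` that contain a corpus.jsonl file."""
    root = Path(root)
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / "corpus.jsonl").is_file()
    )


def load_per_category(root, split="test"):
    """Load each category separately (hypothetical helper; requires beir)."""
    # Imported lazily so find_category_dirs works without beir installed.
    from beir.datasets.data_loader import GenericDataLoader

    loaded = {}
    for category_dir in find_category_dirs(root):
        # Assumption: each category folder has the usual BEIR layout.
        corpus, queries, qrels = GenericDataLoader(
            data_folder=str(category_dir)
        ).load(split=split)
        loaded[category_dir.name] = (corpus, queries, qrels)
    return loaded
```

Calling load_per_category on the cqadupstack folder would return one (corpus, queries, qrels) triple per category, which could then be evaluated one at a time. How to aggregate the per-category scores is still an open question.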

How was CQADupstack handled in the benchmarking for the paper and the leaderboard? Was each category evaluated separately, or were all categories combined into a single evaluation?
