castorini / anserini

Anserini is a Lucene toolkit for reproducible information retrieval research
http://anserini.io/
Apache License 2.0

Improvements to fusion regression yaml #2619

Open lintool opened 1 week ago

lintool commented 1 week ago

This is a good starting point: https://github.com/castorini/anserini/blob/master/src/main/resources/fuse_regression/beir-v1.0.0-robust04.flat.bm25.fuse.bge-base-en-v1.5.bge-flat-onnx.yaml

But I have suggestions for improvements. Instead of:

runs:
  - runs/run.beir-v1.0.0-robust04.flat.bm25.topics.beir-v1.0.0-robust04.test.txt
  - runs/run.beir-v1.0.0-robust04.bge-base-en-v1.5.bge-flat-onnx.topics.beir-v1.0.0-robust04.test.txt

Maybe we can do something like:

runs:
  - name: flat-bm25
    dependency: beir-v1.0.0-robust04.flat.yaml 
    file: runs/run.beir-v1.0.0-robust04.flat.bm25.topics.beir-v1.0.0-robust04.test.txt
  - name: bge
    ...

Then, further down, you can just use ${flat-bm25}, so that if the run name changes, you only need to change it in one spot?
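To make the idea concrete, here is a minimal sketch of how a regression script might expand a `${flat-bm25}`-style reference into the run file path. It is not Anserini's implementation: the `resolve` helper, the `PLACEHOLDER` pattern, and the `--runs` command template are all hypothetical, and the `runs` list is written as a plain Python structure standing in for the parsed yaml. Only the `flat-bm25` entry is included, since the `bge` entry is elided above.

```python
import re

# Stand-in for the parsed `runs:` section proposed above (assumption: the
# yaml is loaded into a list of dicts with `name`, `dependency`, `file`).
runs = [
    {
        "name": "flat-bm25",
        "dependency": "beir-v1.0.0-robust04.flat.yaml",
        "file": "runs/run.beir-v1.0.0-robust04.flat.bm25.topics.beir-v1.0.0-robust04.test.txt",
    },
]

# Hypothetical placeholder syntax: ${name}, where name may contain hyphens and dots.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z0-9_.-]+)\}")


def resolve(text, runs):
    """Replace every ${name} in `text` with the `file` of the run called `name`."""
    by_name = {run["name"]: run["file"] for run in runs}

    def lookup(match):
        name = match.group(1)
        if name not in by_name:
            # Fail loudly on a typo instead of leaving the placeholder in place.
            raise KeyError(f"unknown run name: {name}")
        return by_name[name]

    return PLACEHOLDER.sub(lookup, text)


# Hypothetical command template that refers to the run by name only.
resolved = resolve("--runs ${flat-bm25}", runs)
print(resolved)
```

A custom pattern is used here because Python's `string.Template` does not accept hyphens in identifiers by default, so a name like `flat-bm25` would be rejected. With this layout, renaming the run file means editing only the `file:` entry.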

Happy to discuss if you think otherwise.

Stefan824 commented 1 week ago

On it