Make Pooling Parameterizable

tira-io / teaching-ir-with-shared-tasks

🧪 Resources coupled to ir_datasets and TIREx for IR courses that focus their hands-on labs on shared tasks.

https://tira-io.github.io/teaching-ir-with-shared-tasks/

MIT License


Make Pooling Parameterizable #11

Open janheinrichmerker opened 2 weeks ago

janheinrichmerker commented 2 weeks ago

Currently, there are some hard-coded configurations and other issues in the pooling code:

The Elasticsearch connection and the index used to look up passage IDs by document ID are hard-coded.
Some paths are hard-coded.
The set of retrieval models and re-rankers is hard-coded.

I believe we should move the configuration into CLI options and ideally not rely on Elasticsearch at all (e.g., by retrieving passage IDs directly from the segmented corpus).
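
A minimal sketch of what this could look like, not the repository's actual code. It assumes `argparse` for the CLI and assumes the segmented corpus can be read as records with `doc_id` and `passage_id` fields. All option names, defaults, and field names below are hypothetical.

```python
import argparse
from collections import defaultdict
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser. Every option name and default here is hypothetical."""
    parser = argparse.ArgumentParser(description="Create a judgment pool.")
    parser.add_argument("--corpus-path", type=Path, required=True,
                        help="Segmented corpus to read passages from.")
    parser.add_argument("--output-path", type=Path, default=Path("pool.jsonl"),
                        help="Where to write the pool.")
    parser.add_argument("--retrieval-models", nargs="+", default=["BM25"],
                        help="Retrieval models whose runs are pooled.")
    parser.add_argument("--re-rankers", nargs="*", default=[],
                        help="Optional re-rankers applied to the runs.")
    parser.add_argument("--depth", type=int, default=10,
                        help="Pooling depth per run.")
    return parser


def passage_ids_by_document(passages):
    """Map each document ID to its passage IDs, in corpus order.

    Assumes each record is a dict with 'doc_id' and 'passage_id' keys
    (hypothetical field names), replacing the Elasticsearch lookup.
    """
    lookup = defaultdict(list)
    for passage in passages:
        lookup[passage["doc_id"]].append(passage["passage_id"])
    return dict(lookup)


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(
    ["--corpus-path", "corpus.jsonl", "--retrieval-models", "BM25", "PL2"]
)
```

With this shape, paths and the set of models come from the command line, and the document-to-passage mapping is built once from the corpus rather than queried from a fixed index.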

janheinrichmerker commented 2 weeks ago

I addressed some of the issues in the following commits:

271f402484c7603cca0ffaa642315719710ddb92
6afd729b6b09dad4b1286e76fcdeddce1dfbddb7
af0851d873905530fdb429ebe1d3a6acb5bc7434
54fa4f914478e40e395b04085696a7190e97751c
c297147acebcd5813a96a41ba6247c30bf1c4e15

mam10eks commented 2 weeks ago

Awesome, yes, makes sense.

I will remove the hard-coding for the next iteration, which I will create on Friday.