New task: IR

neulab / ExplainaBoard

Interpretable Evaluation for AI Systems

MIT License

361 stars, 36 forks

Open neubig opened 2 years ago

neubig commented 2 years ago

This is not a high priority, but it would be nice to be able to analyze information retrieval (IR) tasks. Some example benchmarks include

pfliu-nlp commented 2 years ago

Sounds good. We could start with our Dataset Finder dataset and models.