neulab / ExplainaBoard

Interpretable Evaluation for AI Systems

Better support for analyzing retrieve-and-read based QA systems #55

Open zorazrw opened 2 years ago

zorazrw commented 2 years ago

Is there a way to enable analysis for open-domain question answering datasets? Or, at least on the Reading Comprehension (RC) side, is there a way to use/submit different versions of the context dataset for the same RC task, where each version contains the contexts retrieved by a different retrieval model?
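
For concreteness, here is a rough sketch of what I have in mind (the field names and file layout are just illustrative assumptions, not an existing ExplainaBoard format): the same RC questions and gold answers are paired with contexts from different retrievers, and each pairing is written out as its own system-output file so the retrieve-and-read pipelines can be analyzed and compared on the same underlying task.

```python
import json

# Hypothetical example: one RC question, with contexts from two different
# retrieval models. Field names ("question", "context", "true_answers",
# "predicted_answer") are assumptions for illustration only.
question = "Who wrote the opera Carmen?"
gold_answer = "Georges Bizet"

retrieved = {
    "dpr": "Carmen is an opera in four acts by French composer Georges Bizet ...",
    "bm25": "The Carmen Suites are two suites of orchestral music ...",
}

for retriever, context in retrieved.items():
    example = {
        "question": question,
        "context": context,             # differs per retrieval model
        "true_answers": [gold_answer],  # identical across versions
        "predicted_answer": "",         # to be filled in by the reader model
    }
    # One JSON-lines file per retriever, so each retrieve-and-read pipeline
    # can be submitted and analyzed as its own "system" on the same RC task.
    with open(f"qa_system_output.{retriever}.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(example) + "\n")
```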

neubig commented 2 years ago

Thanks @zorazrw! Could you give a slightly more detailed example of what this would look like, for completeness? I think the "new tasks" part is covered by https://github.com/neulab/ExplainaBoard/issues/54, but you also want new functionality for handling retrieved contexts in retrieval-based QA systems, which is an interesting problem.