singh96aman closed 4 months ago
We could present our error analysis the way this paper does. Thoughts? https://proceedings.neurips.cc/paper_files/paper/2023/file/4dbb61cb68671edc4ca3712d70083b9f-Paper-Datasets_and_Benchmarks.pdf
Closing this for now
Similar to the Open QA paper, we should do an error analysis of why the LLMs are getting questions wrong. We've already defined our error categories: under-specification, over-specification, knowledge error, etc.
https://arxiv.org/pdf/2305.12421
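A rough sketch of how the per-category tally could look, assuming each wrongly answered question has already been hand-labeled with one category. The function name `summarize_errors`, the label strings, and the example data are all made up for illustration; the category list only covers the three named above and may be incomplete.

```python
from collections import Counter

# Categories named in this thread; the original list trails off,
# so this tuple is an assumption and may be incomplete.
CATEGORIES = ("under_specification", "over_specification", "knowledge_error")


def summarize_errors(labels):
    """Return {category: (count, share)} for a list of per-question error labels."""
    unknown = set(labels) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    counts = Counter(labels)
    total = len(labels)
    # Share is the fraction of all labeled errors; 0.0 when there are no labels.
    return {c: (counts[c], counts[c] / total if total else 0.0) for c in CATEGORIES}


# Hypothetical labels for four wrongly answered questions (illustrative only).
example = [
    "knowledge_error",
    "under_specification",
    "knowledge_error",
    "over_specification",
]

for category, (count, share) in summarize_errors(example).items():
    print(f"{category}: {count} ({share:.0%})")
```

The resulting counts and shares could then feed a table or bar chart per model, if we want a presentation like the one in the linked papers.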