UMass-Meta-LLM-Eval / llm_eval

A comprehensive study of the LLM-as-a-judge paradigm in a controlled setup that reveals new results about its strengths and weaknesses.
https://arxiv.org/abs/2406.12624

Error Analysis #50

Closed singh96aman closed 4 months ago

singh96aman commented 5 months ago

Similar to the Open QA paper, we should do an error analysis of why LLMs are getting questions wrong. We've already demarcated our categories as under-specification, over-specification, and knowledge error.

[image] https://arxiv.org/pdf/2305.12421
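Once errors are labeled with those categories, the per-category breakdown is a simple tally. A minimal sketch, assuming each wrong answer has already been manually assigned one of the category names above (the labels here are placeholder data, not results from our runs):

```python
from collections import Counter

# Hypothetical labels assigned during manual review of wrong answers;
# the category names follow the taxonomy discussed above.
labels = [
    "under-specification", "knowledge-error", "over-specification",
    "knowledge-error", "under-specification", "knowledge-error",
]

# Count how often each error category occurs and report its share.
counts = Counter(labels)
total = len(labels)
for category, n in counts.most_common():
    print(f"{category}: {n} ({n / total:.0%})")
```

The same counts can then feed whatever chart we settle on for the paper.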

singh96aman commented 5 months ago

We could present our error analysis in a similar style to this. Thoughts?
[image] https://proceedings.neurips.cc/paper_files/paper/2023/file/4dbb61cb68671edc4ca3712d70083b9f-Paper-Datasets_and_Benchmarks.pdf

singh96aman commented 4 months ago

Closing this for now