Anni-Zou / DocBench

DocBench: A Benchmark for Evaluating LLM-based Document Reading Systems

Do we have information about the domains (Academia, Finance, Government, Law, and News) for the dataset? #1

Open cin-klein opened 3 weeks ago

cin-klein commented 3 weeks ago

I want to compute our system's benchmark score for each domain, but I don't see the domain information in the metadata or the qa.jsonl file. Could you provide it? @Anni-Zou

Anni-Zou commented 3 weeks ago

Hi, I have updated the analysis in eval_result.ipynb, so you can refer to it for the relevant information. You can also produce the per-domain and per-QA-type breakdown directly from your own results.
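For anyone who wants to do that breakdown on their own results, here is a minimal sketch. It is not taken from eval_result.ipynb. It assumes you have already attached a domain label and a QA type to each judged answer; the field names `domain`, `qa_type`, and `correct`, the helper `accuracy_by_group`, and the sample values are all hypothetical and would need to be adapted to however your results are stored.

```python
from collections import defaultdict


def accuracy_by_group(records, key):
    """Group records by records[i][key] and return {group: accuracy}.

    Each record is a dict with a boolean "correct" field (hypothetical name).
    """
    totals = defaultdict(int)  # number of questions seen per group
    hits = defaultdict(int)    # number of correct answers per group
    for rec in records:
        group = rec[key]
        totals[group] += 1
        if rec["correct"]:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical judged results; the labels below are placeholders only.
results = [
    {"domain": "Academia", "qa_type": "text-only", "correct": True},
    {"domain": "Academia", "qa_type": "multimodal", "correct": False},
    {"domain": "Finance", "qa_type": "text-only", "correct": True},
    {"domain": "Law", "qa_type": "multimodal", "correct": False},
]

per_domain = accuracy_by_group(results, "domain")
per_qa_type = accuracy_by_group(results, "qa_type")

for name, acc in sorted(per_domain.items()):
    print(f"{name}: {acc:.2%}")
```

The same function covers both breakdowns because the grouping field is passed in as `key`.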