gomate-community / rageval

Evaluation tools for Retrieval-augmented Generation (RAG) methods.
Apache License 2.0
136 stars, 11 forks

List all potential test benchmarks #63

Open faneshion opened 8 months ago

faneshion commented 8 months ago

List the most commonly used datasets in RAG research, and we will add them to the benchmarks.

FBzzh commented 8 months ago
Wenshansilvia commented 8 months ago

Select and implement typical benchmarks, collect the RAG papers that used these benchmarks, and try to reproduce the evaluation results reported in those papers.

  1. List the benchmarks together with their related papers and metrics.
  2. Generate a test set using the baseline RAG method from each paper. Package the test set in dataset format and upload it to HuggingFace.
  3. Reproduce the evaluation results in RAGEval.
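
As a rough illustration of step 2, the sketch below packs baseline outputs into a JSON Lines file, one record per line. The field names (`question`, `contexts`, `answer`, `gt_answers`) and the helpers `pack_testset` and `load_testset` are hypothetical, chosen only for this example; the issue does not fix a schema, and this is not RAGEval's actual format.

```python
import json
import tempfile
from pathlib import Path

# Assumed schema for illustration; the issue does not specify one.
REQUIRED_FIELDS = {"question", "contexts", "answer", "gt_answers"}


def pack_testset(records, path):
    """Write records as JSON Lines, checking each one for the assumed fields."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for i, rec in enumerate(records):
            missing = REQUIRED_FIELDS - set(rec)
            if missing:
                raise ValueError(f"record {i} is missing fields: {sorted(missing)}")
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
    return path


def load_testset(path):
    """Read a JSON Lines file back into a list of dicts."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Placeholder record, not real benchmark data.
example = [{
    "question": "placeholder question",
    "contexts": ["placeholder retrieved passage"],
    "answer": "placeholder answer from the baseline RAG system",
    "gt_answers": ["placeholder reference answer"],
}]

out = pack_testset(example, Path(tempfile.gettempdir()) / "rag_testset.jsonl")
assert load_testset(out) == example  # round trip preserves the records
```

A file in this shape can typically be loaded with the HuggingFace `datasets` library via `load_dataset("json", data_files=...)` and published with `push_to_hub`; the repository name and split layout would be up to each benchmark owner.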

ELI5 @QianHaosheng, ASQA @bugtig6351, FEVER @henan991201