run-llama / llama-hub

A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
https://llamahub.ai/
MIT License
3.45k stars, 732 forks

RagEvaluatorPack #683

Closed by nerdai 11 months ago

nerdai commented 11 months ago

Description

Adds a new pack: RagEvaluatorPack.

Given:

  1. A rag_dataset (i.e., LabelledRagDataset)
  2. A query_engine (i.e., BaseQueryEngine) built off the same source Documents as the rag_dataset
  3. Optionally, an LLM to be used as the judge (defaults to OpenAI gpt-4)

Returns: Benchmark results, reporting the same metrics shown in the Dataset Card.
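For reviewers who want a mental model of the flow described above, here is a minimal, dependency-free sketch. It is not the pack's code: `LabelledExample`, `evaluate_rag`, `toy_query_engine`, and `exact_match_judge` are hypothetical stand-ins for the rag_dataset, the pack's run logic, the query_engine, and the judge LLM, and the exact-match score is a placeholder for whatever metrics the judge actually produces.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class LabelledExample:
    """Hypothetical stand-in for one labelled example in a rag_dataset."""
    query: str
    reference_answer: str


def evaluate_rag(
    examples: List[LabelledExample],
    query_fn: Callable[[str], str],
    judge_fn: Callable[[str, str, str], Dict[str, float]],
) -> Dict[str, float]:
    """Query the engine for every example, score each response with the
    judge, and return the mean of each metric over the dataset."""
    per_example: List[Dict[str, float]] = []
    for ex in examples:
        response = query_fn(ex.query)  # stands in for the query_engine
        per_example.append(judge_fn(ex.query, response, ex.reference_answer))
    if not per_example:
        return {}
    # Assumes the judge returns the same metric names for every example.
    return {
        f"mean_{name}": mean(scores[name] for scores in per_example)
        for name in per_example[0]
    }


# Toy stand-ins (assumptions for illustration only).
def toy_query_engine(query: str) -> str:
    return {"capital of France?": "Paris"}.get(query, "unknown")


def exact_match_judge(query: str, response: str, reference: str) -> Dict[str, float]:
    matched = response.strip().lower() == reference.strip().lower()
    return {"exact_match": 1.0 if matched else 0.0}


examples = [
    LabelledExample("capital of France?", "Paris"),
    LabelledExample("capital of Spain?", "Madrid"),
]
results = evaluate_rag(examples, toy_query_engine, exact_match_judge)
print(results)
```

Passing the engine and the judge in as plain callables keeps the sketch independent of any particular library; in the actual pack these roles are filled by the query_engine and the judge LLM listed above.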

Fixes # (issue)

Type of Change

Please delete options that are not relevant.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.