truera / trulens

Evaluation and Tracking for LLM Experiments
https://www.trulens.org/
MIT License
2.05k stars, 177 forks

Need clarification on three RAG evaluation metrics #606

Closed: vr25 closed this issue 9 months ago

vr25 commented 9 months ago

Hi:

Really interesting work on RAG evaluation metrics. However, I am unable to understand how these evaluation functions work. Could you please point me to documentation that explains the mathematical formulas for all three:

  1. Context Relevance
  2. Groundedness
  3. Answer Relevance

Would really appreciate your help...thanks much in advance.

joshreini1 commented 9 months ago

Hi @vr25 - these are model-based metrics.

  1. Context relevance can be measured with smaller BERT-style models, embedding distances, or LLMs.
  2. Groundedness can be measured with smaller NLI models or LLMs.
  3. Answer relevance can be measured with BERT-style models, embedding distances, or LLMs.

TruLens gives you the flexibility to use these different options for metric generation.
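To make the "embedding distance" option concrete, here is a minimal sketch of how such scores could be computed. It is not the TruLens implementation or API: `embed` is a hypothetical toy stand-in (bag-of-words counts) for a real embedding model, and `groundedness` uses token overlap as a crude placeholder for an NLI model or LLM judgment. All function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance(query: str, text: str) -> float:
    """Context relevance (text = retrieved context) or answer relevance
    (text = generated answer), scored as similarity to the query."""
    return cosine_similarity(embed(query), embed(text))


def groundedness(answer: str, context: str) -> float:
    """Mean, over answer sentences, of the share of sentence tokens that
    also appear in the context. Token overlap is only a placeholder here
    for an NLI entailment score or an LLM judgment."""
    context_tokens = set(embed(context))
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    scores = []
    for sentence in sentences:
        tokens = set(embed(sentence))
        if tokens:
            scores.append(len(tokens & context_tokens) / len(tokens))
    return sum(scores) / len(scores) if scores else 0.0


question = "What is the capital of France?"
context = "Paris is the capital of France."
answer = "The capital of France is Paris."

print("context relevance:", relevance(question, context))
print("answer relevance: ", relevance(question, answer))
print("groundedness:     ", groundedness(answer, context))
```

With a real setup, `embed` would be replaced by an actual embedding model and the overlap check by an NLI model or LLM call; the overall shape (score each of query-context, context-answer, and query-answer separately) stays the same.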

Does that help?

vr25 commented 9 months ago

@joshreini1 This helps...thanks for the clarification.