Need clarification on three RAG evaluation metrics

truera / trulens

Evaluation and Tracking for LLM Experiments

https://www.trulens.org/

MIT License

2.05k stars, 177 forks

Need clarification on three RAG evaluation metrics #606

Status: Closed (vr25 closed this issue 9 months ago)

vr25 commented 9 months ago

Hi:

Really interesting work on RAG evaluation metrics. However, I am unable to understand how these evaluation functions work. Could you please point me to some relevant documentation that explains the mathematical formulas for all three:

Context Relevance
Groundedness
Answer Relevance

Would really appreciate your help...thanks much in advance.

joshreini1 commented 9 months ago

Hi @vr25 - these are model-based metrics.

Context relevance can be measured with smaller BERT-style models, embedding distances, or LLMs.
Groundedness can be measured with smaller NLI models or LLMs.
Answer relevance can be measured with BERT-style models, embedding distances, or LLMs.

TruLens gives you the flexibility to use these different options for metric generation.
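
To make the embedding-distance option concrete, here is a minimal sketch of how a relevance score could be computed as the cosine similarity between two vectors. This is not the TruLens API or its implementation: `embed`, `cosine_similarity`, and `relevance` are hypothetical names, and the bag-of-words `embed` function is only a toy stand-in for a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance(query: str, text: str) -> float:
    """Hypothetical helper: score how close `text` is to `query`."""
    return cosine_similarity(embed(query), embed(text))
```

Under these assumptions, a context relevance score would be `relevance(question, retrieved_chunk)` and an answer relevance score would be `relevance(question, answer)`. Swapping the toy `embed` for a real embedding model changes the vectors but not the formula. Groundedness is not covered by this sketch, since it relies on an NLI model or an LLM rather than a distance.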

Does that help?

vr25 commented 9 months ago

@joshreini1 This helps...thanks for the clarification.