Lightning-AI / torchmetrics

Torchmetrics - Machine learning metrics for distributed, scalable PyTorch applications.
https://lightning.ai/docs/torchmetrics/
Apache License 2.0
2.01k stars 391 forks

Need higher level RAG metrics #2598

Open devinbost opened 2 weeks ago

devinbost commented 2 weeks ago

Problem & Motivation

There is a huge wave of interest in high-accuracy Q&A, such as via Retrieval Augmented Generation (RAG). RAG accuracy is largely driven by how well vector search retrieves the correct context for an LLM to answer questions. When evaluating embedding models, vector search retrieval metrics are helpful but insufficient, because they don't reveal how well the retrieved content actually answers the target questions.
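To make the gap concrete, here is a minimal sketch in plain Python. It is not torchmetrics or ragulate code; the function names `recall_at_k` and `token_f1`, the document ids, and the answers are hypothetical toy values chosen for illustration. It shows a query where retrieval looks perfect while the generated answer is still wrong.

```python
from collections import Counter


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def token_f1(prediction, reference):
    """Simplified token-overlap F1 between a generated and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: the relevant document is retrieved,
# but the generated answer does not match the reference answer.
retrieved = ["doc_7", "doc_2", "doc_9"]
relevant = ["doc_2"]
generated_answer = "the warranty lasts one year"
reference_answer = "two years"

retrieval_score = recall_at_k(retrieved, relevant, k=3)
answer_score = token_f1(generated_answer, reference_answer)

print(f"recall@3 = {retrieval_score:.2f}")
print(f"answer F1 = {answer_score:.2f}")
```

In this toy case the retrieval metric is at its maximum while the answer-level score is zero, which is the kind of mismatch a higher-level RAG metric would need to surface.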

Pitch

I'd love to see an integration with a tool like our new ragulate library (Apache 2 licensed) that would simplify model evaluation on RAG Q&A: https://github.com/epinzur/ragulate/tree/main

Additional context

I was going to suggest that you integrate with trulens, but then I discovered that we built ragulate to automate much of the process of using trulens, and we'd love feedback on it.

github-actions[bot] commented 2 weeks ago

Hi! Thanks for your contribution! Great first issue!

Borda commented 2 weeks ago

@devinbost, thank you for your suggestion. Indeed, it would be nice to have such metrics available in TM. That said, I would love to see your PR adding them as complete code rather than referring to an external package... :flamingo: