explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0
7.33k stars 746 forks

feat: make general purpose metrics more general #1666

Closed jjmachan closed 6 days ago

jjmachan commented 1 week ago

Metrics Converted

A few different examples:

Aspect Critic

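from ragas.metrics import AspectCritic
from ragas.dataset_schema import SingleTurnSample

# evaluator_llm is assumed to be an evaluator LLM defined elsewhere;
# its setup is not shown in this description.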

only_response = SingleTurnSample(
    response="The Eiffel Tower is located in Paris."
)

grammar_critic = AspectCritic(
    name="grammar",
    definition="Is the response grammatically correct?",
    llm=evaluator_llm
)

await grammar_critic.single_turn_ascore(only_response)
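
The snippets here use a bare `await`, which works in a notebook or IPython session. In a plain script the call needs an event loop. Below is a minimal sketch of that wrapping, assuming a hypothetical `StubCritic` (not part of ragas) in place of a real `AspectCritic` so that it runs without an LLM:

```python
import asyncio


class StubCritic:
    """Hypothetical stand-in for AspectCritic; not part of ragas."""

    async def single_turn_ascore(self, sample: dict) -> int:
        # A real critic would ask the evaluator LLM for a verdict.
        # This stub only checks that a non-empty response is present.
        return int(bool(sample.get("response")))


async def main() -> int:
    critic = StubCritic()
    sample = {"response": "The Eiffel Tower is located in Paris."}
    return await critic.single_turn_ascore(sample)


# asyncio.run creates the event loop that a notebook provides implicitly.
score = asyncio.run(main())
print(score)
```

With a real critic, the same pattern would apply: put the `await grammar_critic.single_turn_ascore(only_response)` call inside `main()`.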

With a reference:

answer_correctness_critic = AspectCritic(
    name="answer_correctness",
    definition="Are the response and the reference answer the same?",
    llm=evaluator_llm
)

# data row
sample = SingleTurnSample(
    user_input="Where is the Eiffel Tower located?",
    response="The Eiffel Tower is located in Paris.",
    reference="London"
)
await answer_correctness_critic.single_turn_ascore(sample)
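
The two examples use one metric class with different sets of sample fields. The sketch below illustrates that idea with no dependencies: a judge input is built from whichever fields are populated. It is not ragas's implementation. `Sample` and `build_judge_input` are hypothetical names, and the "Criterion:" layout is an assumption made for the example.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Sample:
    """Hypothetical stand-in mirroring the SingleTurnSample fields used above."""

    user_input: Optional[str] = None
    response: Optional[str] = None
    reference: Optional[str] = None


def build_judge_input(definition: str, sample: Sample) -> str:
    # Keep only the fields that were actually set on the sample.
    populated = {k: v for k, v in asdict(sample).items() if v is not None}
    lines = [f"Criterion: {definition}"]
    lines += [f"{name}: {value}" for name, value in populated.items()]
    return "\n".join(lines)


# Response-only sample: the judge input carries one field.
grammar_input = build_judge_input(
    "Is the response grammatically correct?",
    Sample(response="The Eiffel Tower is located in Paris."),
)

# Full sample: the same function now includes the question and the reference.
correctness_input = build_judge_input(
    "Are the response and the reference answer the same?",
    Sample(
        user_input="Where is the Eiffel Tower located?",
        response="The Eiffel Tower is located in Paris.",
        reference="London",
    ),
)
print(grammar_input)
print(correctness_input)
```

The point of the design is that the criterion text, not the metric class, decides which fields matter, so a grammar check and a reference comparison can share one code path.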

Note: this only works for multi-turn metrics for now

shahules786 commented 1 week ago