To get a similarity metric by comparing the output from your application with the golden sample data, you can use the `answer_correctness` metric from the `ragas` library. Here is an example of how to do this:
```python
from datasets import Dataset
from ragas.metrics import answer_correctness
from ragas import evaluate

# Each sample pairs the generated answer with its golden (ground truth) answer.
data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times'],
}

dataset = Dataset.from_dict(data_samples)

# Compute answer_correctness for every sample and view the per-row scores.
score = evaluate(dataset, metrics=[answer_correctness])
score.to_pandas()
```
This code calculates an answer correctness score between each generated answer and its ground truth answer, yielding a metric from 0 to 1, where a higher score indicates closer alignment.
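Continuing from the snippet above, you can also pull out the per-sample scores from the resulting dataframe; the score column is assumed here to be named after the metric:

```python
# Continuation of the example above: `score` is the object returned by evaluate().
df = score.to_pandas()

# One row per sample; the score column is assumed to match the metric name.
print(df[['question', 'answer_correctness']])
```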
Make sure you have your OpenAI API key set in your environment, as the `answer_correctness` metric uses OpenAI's API to compute the score:
```python
import os
os.environ["OPENAI_API_KEY"] = "your-openai-key"
```
You can also explore other metrics provided by the `ragas` library, such as `faithfulness`, `context_precision`, and `answer_relevancy`, to get a comprehensive evaluation of your RAG pipeline.
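If you want to combine several of these metrics, note that `faithfulness`, `context_precision`, and `answer_relevancy` also need the retrieved contexts for each sample. Below is a minimal sketch using the same column names as above (newer ragas versions may expect different column names, so check the docs for your version); the context passage here is made up for illustration:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_correctness,
    answer_relevancy,
    context_precision,
    faithfulness,
)

# Toy data; the 'contexts' entry is a made-up retrieved passage for illustration.
data_samples = {
    'question': ['When was the first super bowl?'],
    'answer': ['The first superbowl was held on Jan 15, 1967'],
    'contexts': [['The First AFL-NFL World Championship Game was played on January 15, 1967.']],
    'ground_truth': ['The first superbowl was held on January 15, 1967'],
}

dataset = Dataset.from_dict(data_samples)

# Evaluate several metrics in one pass; each produces a 0-1 score per sample.
score = evaluate(
    dataset,
    metrics=[answer_correctness, faithfulness, context_precision, answer_relevancy],
)
print(score.to_pandas())
```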
To install the `ragas` library, you can use pip:
```bash
pip install ragas
```
Alternatively, you can install it from source:
```bash
pip install git+https://github.com/explodinggradients/ragas
```
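After installing, you can confirm which package and version pip picked up:

```bash
pip show ragas
```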
For more detailed information and examples, refer to the Ragas documentation [1][2][3].
- [x] I checked the documentation and related resources and couldn't find an answer to my question.

Your Question
I already have golden data and I would like to get a similarity metric by comparing the output from the application with the golden sample data.