explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0

ValidationError with OpenAIEmbeddings in Ragas #1133

Open · SushmitaSingh96 opened this issue 3 months ago

SushmitaSingh96 commented 3 months ago

A ValidationError occurs when trying to use the evaluate function from Ragas with the OpenAIEmbeddings model. The error message suggests using AzureOpenAIEmbeddings when using Azure, but I am using the standard OpenAI API.

To Reproduce

Steps to reproduce the behavior:

  1. Use the following code from the Ragas documentation: Metrics - Answer Correctness
  2. Run the code, and observe the error.
from datasets import Dataset 
from ragas.metrics import answer_correctness
from ragas import evaluate

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset, metrics=[answer_correctness])
score.to_pandas()

Error Message

  181 def embedding_factory(
  182     model: str = "text-embedding-ada-002", run_config: t.Optional[RunConfig] = None
  183 ) -> BaseRagasEmbeddings:
--> 184     openai_embeddings = OpenAIEmbeddings(model=model)
  185     if run_config is not None:
...
  343     object_setattr(__pydantic_self__, '__dict__', values)

ValidationError: 1 validation error for OpenAIEmbeddings
__root__
  If you are using Azure, please use the `AzureOpenAIEmbeddings` class. (type=value_error)

Expected behavior

The code should execute without errors, generating the evaluation scores and converting them to a pandas DataFrame.

Suggested Workaround

The issue seems to be related to the LangChain framework, and a suggested workaround can be found here.

The code change is:

embeddings = OpenAIEmbeddings(openai_api_type="openai")
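Applied to the reproduction above, the workaround might look roughly like this (an untested sketch; it assumes a Ragas version whose evaluate accepts an embeddings argument and whose LangchainEmbeddingsWrapper wraps a LangChain embeddings object, as in the comment below):

import os
from datasets import Dataset
from langchain.embeddings import OpenAIEmbeddings
from ragas import evaluate
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.metrics import answer_correctness

# Explicitly select the non-Azure code path so LangChain's validator
# does not trip over a stray OPENAI_API_TYPE environment variable.
embeddings = OpenAIEmbeddings(openai_api_type="openai")

data_samples = {
    'question': ['When was the first super bowl?'],
    'answer': ['The first superbowl was held on Jan 15, 1967'],
    'ground_truth': ['The first superbowl was held on January 15, 1967'],
}
dataset = Dataset.from_dict(data_samples)

score = evaluate(
    dataset,
    metrics=[answer_correctness],
    embeddings=LangchainEmbeddingsWrapper(embeddings),
)
score.to_pandas()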

Environment:

dosubot[bot] commented 3 months ago

Hey, @SushmitaSingh96! I'm here to help you with bugs, questions, and more while we wait for a human maintainer. Let's tackle this together!

To resolve the ValidationError when using the evaluate function from Ragas with the OpenAIEmbeddings model, ensure that you are passing the correct instances of the embeddings and LLM models. Since you are using the standard OpenAI API and not Azure, you should wrap your OpenAI embeddings and LLM instances with the appropriate wrappers provided by Ragas.

Here is an example of how to do this:

from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.metrics import answer_correctness
from ragas import evaluate
from datasets import Dataset

# Initialize OpenAI embeddings and LLM
# (text-davinci-003 has been retired; gpt-3.5-turbo-instruct is the
# closest completion-style replacement)
openai_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
openai_llm = OpenAI(model="gpt-3.5-turbo-instruct")

# Wrap them with Ragas wrappers
wrapped_embeddings = LangchainEmbeddingsWrapper(openai_embeddings)
wrapped_llm = LangchainLLMWrapper(openai_llm)

# Prepare your data
data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)

# Perform evaluation
score = evaluate(dataset, metrics=[answer_correctness], llm=wrapped_llm, embeddings=wrapped_embeddings)
print(score.to_pandas())

This code ensures that the OpenAI embeddings and LLM are correctly wrapped and passed to the evaluate function, which should resolve the ValidationError [1][2].
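One usage note: the snippet assumes your OpenAI credentials are already configured in the environment, for example (the key value is a placeholder):

import os

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; substitute your real key
# Ensure no leftover Azure setting forces the Azure validation path:
os.environ.pop("OPENAI_API_TYPE", None)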

To continue talking to Dosu, mention @dosu.

jjmachan commented 3 months ago

thanks for reporting this @SushmitaSingh96, seems like the fix you suggested should help, but I'm not sure why I don't see it on my end

have you set OPENAI_API_TYPE in your env?
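For anyone who wants to check, a minimal sketch:

import os

# If this prints "azure", LangChain's OpenAIEmbeddings will take the Azure
# validation path and raise the ValidationError shown above.
print(os.environ.get("OPENAI_API_TYPE"))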