explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

Invalid n value (currently only n = 1 is supported) #1072

Open · jidechao opened 2 weeks ago

jidechao commented 2 weeks ago

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug

openai.BadRequestError: Error code: 400 - {'detail': 'Invalid n value (currently only n = 1 is supported)'}

Ragas version: 0.1.10

Code to Reproduce

from ragas import evaluate
from langchain_openai import ChatOpenAI, OpenAI
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from datasets import Dataset
from langchain.vectorstores.milvus import Milvus

# embedding (chroma)
model_name = "BAAI/bge-m3"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
embeddings = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)

import nest_asyncio

nest_asyncio.apply()

llm = ChatOpenAI(model="deepseek-chat", api_key="sk-***", base_url="https://api.deepseek.com/v1", temperature=0, n=1)

from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_relevancy,
    context_recall,
    context_precision,
)

data = {
    'question': ['What is the capital of France?'],
    'contexts': [['Paris is the capital of France.']],
    'answer': ['Paris'],
    'ground_truth': ['Paris']
}
dataset = Dataset.from_dict(data)

result = evaluate(
    dataset=dataset,
    metrics=[
        context_precision,
        context_recall,
        faithfulness,
        answer_relevancy,
    ],
    llm=llm,
    embeddings=embeddings,
)

result

Error trace

Exception in thread Thread-11:
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 95, in run
    results = self.loop.run_until_complete(self._aresults())
  File "/usr/local/lib/python3.10/dist-packages/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 83, in _aresults
    raise e
  File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 78, in _aresults
    r = await future
  File "/usr/lib/python3.10/asyncio/tasks.py", line 571, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/usr/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 37, in sema_coro
    return await coro
  File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 111, in wrapped_callable_async
    return counter, await callable(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ragas/metrics/base.py", line 125, in ascore
    raise e
  File "/usr/local/lib/python3.10/dist-packages/ragas/metrics/base.py", line 121, in ascore
    score = await self._ascore(row=row, callbacks=group_cm, is_async=is_async)
  File "/usr/local/lib/python3.10/dist-packages/ragas/metrics/_answer_relevance.py", line 152, in _ascore
    result = await self.llm.generate(
  File "/usr/local/lib/python3.10/dist-packages/ragas/llms/base.py", line 93, in generate
    return await agenerate_text_with_retry(
  File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
    return await copy(fn, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 111, in __call__
    do = await self.iter(retry_state=retry_state)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 398, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 114, in __call__
    result = await fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ragas/llms/base.py", line 170, in agenerate_text
    return await self.langchain_llm.agenerate_prompt(
  File "/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py", line 691, in agenerate_prompt
    return await self.agenerate(
  File "/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py", line 651, in agenerate
    raise exceptions[0]
  File "/usr/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py", line 836, in _agenerate_with_cache
    result = await self._agenerate(
  File "/usr/local/lib/python3.10/dist-packages/langchain_openai/chat_models/base.py", line 674, in _agenerate
    response = await self.async_client.create(**payload)
  File "/usr/local/lib/python3.10/dist-packages/openai/resources/chat/completions.py", line 1289, in create
    return await self._post(
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1816, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1514, in request
    return await self._request(
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1610, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'detail': 'Invalid n value (currently only n = 1 is supported)'}


dosubot[bot] commented 2 weeks ago

Hello, @jidechao! I'm here to help you with any bugs or questions you have. Let's work together to solve this issue.

To resolve the openai.BadRequestError: Error code: 400 - {'detail': 'Invalid n value (currently only n = 1 is supported)'} error, ensure that the n parameter is set to 1 in your ChatOpenAI model configuration. The ragas library currently supports only n = 1 for text generation.

Here's the corrected part of your code:

llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-36906d31048e43c1a7ba5d7b9b45ea0f",
    base_url="https://api.deepseek.com/v1",
    temperature=0,
    n=1  # Ensure n is set to 1
)

Make sure that the n parameter is not set to any value other than 1. This should resolve the error related to the "Invalid n value" [1].

To continue talking to Dosu, mention @dosu.

jidechao commented 2 weeks ago

I tested it and found that three of the metrics run fine. With all four metrics, it fails with the error: Invalid n value (currently only n = 1 is supported).
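
To confirm which metric triggers the failing request, here is a minimal isolation sketch (assuming the dataset, llm, and embeddings defined in the reproduction script above) that runs the metrics one at a time:

from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    context_precision,
    context_recall,
    faithfulness,
)

# Run each metric in its own evaluate() call so the metric that sends the
# rejected n > 1 request can be identified from its error message.
for metric in [context_precision, context_recall, faithfulness, answer_relevancy]:
    try:
        result = evaluate(dataset=dataset, metrics=[metric], llm=llm, embeddings=embeddings)
        print(metric.name, "->", result)
    except Exception as e:
        print(metric.name, "failed:", e)

Depending on the ragas version, a failing metric may surface as a raised exception or as a NaN score in the result.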

dosubot[bot] commented 2 weeks ago

The "Invalid n value" error when evaluating four metrics simultaneously is likely due to the internal handling of multiple metrics in the ragas library. Here are a few steps to ensure your setup is correct:

  1. Ensure Dataset Format: Verify that your dataset has the correct format and column names (question, contexts, answer, ground_truth).

  2. Check Metric Initialization: Ensure that all metrics are properly initialized with the required language models (LLMs) and embeddings.

  3. Set n Parameter Correctly: The n parameter in the ChatOpenAI class should be set to 1, as the library currently supports only n = 1 for text generation.

Here is an example of how to structure your dataset and call the evaluate function:

from ragas import evaluate
from datasets import Dataset
from langchain_openai import ChatOpenAI
from langchain_community.embeddings import HuggingFaceBgeEmbeddings

# Example dataset
data = {
    'question': ["What is the capital of France?"],
    'contexts': [["Paris is the capital of France."]],
    'answer': ["Paris"],
    'ground_truth': ["Paris"]
}

dataset = Dataset.from_dict(data)

# Initialize LLM and embeddings
llm = ChatOpenAI(model="deepseek-chat", api_key="sk-*******", base_url="https://api.deepseek.com/v1", temperature=0, n=1)
embeddings = HuggingFaceBgeEmbeddings(model_name="BAAI/bge-m3", model_kwargs={'device': 'cpu'}, encode_kwargs={'normalize_embeddings': True})

# Import metrics
from ragas.metrics import faithfulness, answer_relevancy, context_precision, context_recall

# Call the evaluate function
result = evaluate(
    dataset=dataset,
    metrics=[context_precision, context_recall, faithfulness, answer_relevancy],
    llm=llm,
    embeddings=embeddings,
)

print(result)

Ensure that the n parameter is set to 1 in the ChatOpenAI configuration and that the dataset is formatted as shown above; this should resolve the "Invalid n value" error [1][2].

To continue talking to Dosu, mention @dosu.

jidechao commented 2 weeks ago

I tested it again, and it seems that answer_relevancy has an issue, while the other metrics run normally.

error: (screenshot)

correct: (screenshot)
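
If answer_relevancy is the only metric that fails, one possible workaround is to lower its strictness. This is a sketch under the assumption that, in ragas 0.1.x, AnswerRelevancy requests strictness completions (default 3) in a single LLM call, so the request carries n = 3 regardless of the n=1 set on ChatOpenAI, and the DeepSeek endpoint rejects any n > 1:

from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, context_recall, faithfulness

# Assumption: answer_relevancy generates `strictness` candidate questions per
# sample in one request (n = strictness). Forcing a single completion keeps the
# request within what the DeepSeek API accepts.
answer_relevancy.strictness = 1

result = evaluate(
    dataset=dataset,
    metrics=[context_precision, context_recall, faithfulness, answer_relevancy],
    llm=llm,
    embeddings=embeddings,
)

The trade-off is that the relevancy score is then computed from a single generated question instead of three, so it is noisier.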

123gbk commented 1 week ago

I'm having the same problem.

Mervyn-1 commented 2 hours ago

same problem