jidechao opened this issue 5 months ago
Hello, @jidechao! I'm here to help you with any bugs or questions you have. Let's work together to solve this issue.
To resolve the `openai.BadRequestError: Error code: 400 - {'detail': 'Invalid n value (currently only n = 1 is supported)'}` error, ensure that the `n` parameter is set to 1 in your `ChatOpenAI` model configuration. The ragas library currently supports only `n = 1` for text generation.
Here's the corrected part of your code:
```python
llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-*******",
    base_url="https://api.deepseek.com/v1",
    temperature=0,
    n=1,  # ensure n is set to 1
)
```
Make sure that the `n` parameter is not set to any value other than 1. This should resolve the "Invalid n value" error [1].
To continue talking to Dosu, mention @dosu.
I tested it and found that three metrics can be executed. With four metrics, it reports an error: `Invalid n value (currently only n = 1 is supported)`
The "Invalid n value" error when evaluating four metrics simultaneously is likely due to the internal handling of multiple metrics in the ragas
library. Here are a few steps to ensure your setup is correct:
1. Ensure dataset format: Verify that your dataset has the correct format and column names (`question`, `contexts`, `answer`, `ground_truth`).
2. Check metric initialization: Ensure that all metrics are properly initialized with the required language models (LLMs) and embeddings.
3. Set the `n` parameter correctly: The `n` parameter in the `ChatOpenAI` class should be set to 1, as the library currently supports only `n = 1` for text generation.
Here is an example of how to structure your dataset and call the `evaluate` function:
```python
from ragas import evaluate
from datasets import Dataset
from langchain_openai import ChatOpenAI
from langchain_community.embeddings import HuggingFaceBgeEmbeddings

# Example dataset
data = {
    'question': ["What is the capital of France?"],
    'contexts': [["Paris is the capital of France."]],
    'answer': ["Paris"],
    'ground_truth': ["Paris"],  # a single string per sample
}
dataset = Dataset.from_dict(data)

# Initialize LLM and embeddings
llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-*******",
    base_url="https://api.deepseek.com/v1",
    temperature=0,
    n=1,
)
embeddings = HuggingFaceBgeEmbeddings(
    model_name="BAAI/bge-m3",
    model_kwargs={'device': 'cpu'},
    encode_kwargs={'normalize_embeddings': True},
)

# Import metrics
from ragas.metrics import faithfulness, answer_relevancy, context_precision, context_recall

# Call the evaluate function
result = evaluate(
    dataset=dataset,
    metrics=[context_precision, context_recall, faithfulness, answer_relevancy],
    llm=llm,
    embeddings=embeddings,
)
print(result)
```
Ensure that the `n` parameter is set to 1 in the `ChatOpenAI` configuration. With the dataset correctly formatted and `n` properly set, the "Invalid n value" error should be resolved [1][2].
To continue talking to Dosu, mention @dosu.
I tested it again, and it seems that answer_relevancy has an issue, while other metrics can be executed normally.
error: (screenshot)
correct: (screenshot)
I'm having the same problem.
same problem
@jidechao @123gbk @Mervyn-1 which model service are you using?
The issue is that you're using `ChatOpenAI`, so ragas expects the `n` parameter to be supported. But I realize we need a way to override it now.
I guess the problem might be with deepseek-chat, which does not support n > 1. I'm not entirely sure about that, but I just found that both @jidechao and I are using deepseek compatible mode.
Maybe we can fix this problem by calling the deepseek API `n` times with one reply each time, rather than calling it once with `n` replies.
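A minimal sketch of that idea with langchain's `ChatOpenAI`, assuming a Deepseek-compatible endpoint (the key, prompt, and loop count here are placeholders):

```python
# Emulate n=3 by making three n=1 calls and collecting the replies.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="deepseek-chat",
    api_key="sk-*******",
    base_url="https://api.deepseek.com/v1",
    temperature=0.7,  # non-zero so repeated calls can differ
    n=1,              # Deepseek only accepts n = 1
)

prompt = "Generate a question whose answer would be: Paris"
replies = [llm.invoke(prompt).content for _ in range(3)]  # 3 calls, 1 reply each
print(replies)
```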
This issue is specifically related to the Deepseek API, which doesn't support the OpenAI API feature of generating multiple responses with a single call (the `n` parameter).
I've implemented a fix by modifying the `_ascore` method in `src/ragas/metrics/_answer_relevance.py`. Here's the updated code:
```python
async def _ascore(self, row: t.Dict, callbacks: Callbacks) -> float:
    assert self.llm is not None, "LLM is not set"
    prompt = self._create_question_gen_prompt(row)

    answers = []
    for _ in range(self.strictness):
        result = await self.llm.generate(
            prompt,
            n=1,
            callbacks=callbacks,
        )
        generated_text = result.generations[0][0].text
        answer = await _output_parser.aparse(generated_text, prompt, self.llm)
        if answer is not None:
            answers.append(answer)

    if not answers:
        return np.nan
    return self._calculate_score(answers, row)
```
This solution loops through the generation process `self.strictness` times, generating a single response on each iteration. This workaround allows the code to work with APIs like Deepseek that don't support the `n` parameter for multiple generations.
However, it's worth noting that this approach might have performance implications when used with other models that do support generating multiple responses in a single call.
A more optimal solution would be a mechanism that switches between looping and using the `n` parameter based on the capabilities of the underlying LLM, but I don't know what's the best way to add this information to the LLM class.
@Xiyuche actually we do that for LLMs that don't support the `n` parameter, and the solution is similar to what @PhillipRt has already suggested (thanks a lot btw 🙂 ❤️).
The con is that it's hard to keep a list. For example, `ChatOpenAI` does support it, but since you're using the same class here, this happens.
https://github.com/explodinggradients/ragas/blob/b988f538b968b378604240d2b279f8bf4d8ecca9/src/ragas/llms/base.py#L29-L35
https://github.com/explodinggradients/ragas/blob/b988f538b968b378604240d2b279f8bf4d8ecca9/src/ragas/llms/base.py#L39-L44
I think the solution would be to have an argument inside the LLM for this.
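An illustrative sketch of that argument; the wrapper class and the `supports_n` flag below are hypothetical names, not the actual ragas API:

```python
# A flag on the LLM wrapper instead of an isinstance-based allowlist:
# callers set supports_n=False for Deepseek-style endpoints.
import typing as t

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI


class LLMWithNFlag:
    def __init__(self, chat_model: ChatOpenAI, supports_n: bool = True):
        self.chat_model = chat_model
        self.supports_n = supports_n

    def generate(self, prompt: str, n: int = 1) -> t.List[str]:
        if n == 1 or self.supports_n:
            # One request; the backend returns all n completions at once.
            result = self.chat_model.generate([[HumanMessage(content=prompt)]], n=n)
            return [g.text for g in result.generations[0]]
        # Fallback for endpoints that reject n > 1: loop n times instead.
        return [self.chat_model.invoke(prompt).content for _ in range(n)]
```

With something like this, `LLMWithNFlag(ChatOpenAI(...), supports_n=False)` would cover Deepseek without hard-coding a list of supported classes.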
Thank you, a lot of API providers like deepseek recommend using compatible mode directly, which is, however, not perfectly compatible with openai 😂 just like `n`.
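For anyone unsure whether their OpenAI-compatible endpoint honors `n > 1`, a quick probe with the official openai client (base URL, key, and model are placeholders):

```python
# Request two completions and see whether the endpoint accepts it.
from openai import OpenAI

client = OpenAI(api_key="sk-*******", base_url="https://api.deepseek.com/v1")
try:
    resp = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "Say hi"}],
        n=2,
    )
    print(f"n=2 supported, got {len(resp.choices)} choices")
except Exception as exc:  # Deepseek responds with a 400 'Invalid n value' here
    print(f"n=2 not supported: {exc}")
```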
[ ] I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
Ragas version: 0.1.10
Code to Reproduce