From the error stack you've provided, it seems like the error is occurring in the default_parser function in eval_utils.py. This function tries to split the evaluation response into a score and a reasoning string, and then convert the score into a float. The error message indicates that it is trying to convert an empty string to a float, which is not possible.
This could be happening because the apredict method called in CorrectnessEvaluator.aevaluate is returning an evaluation response that doesn't contain a score. The apredict method is supposed to return a string containing the score and the reasoning separated by a newline character. If the score is missing or is not a valid float, it would cause the ValueError you're seeing.
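For context, here is a rough sketch of what such a parser could look like, inferred from the behavior described above rather than copied from the library source:

```python
def default_parser(eval_response: str):
    # The first line is expected to hold the numeric score; the rest is the reasoning.
    score_str, reasoning_str = eval_response.split("\n", 1)
    score = float(score_str)  # raises ValueError if score_str is empty or non-numeric
    reasoning = reasoning_str.strip()
    return score, reasoning
```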
Here are a few things you can try to resolve this issue:
1. Check the apredict method: make sure the apredict method of your judge_llm returns a string with the score and the reasoning separated by a newline character, and that the score is a valid float.
2. Update to the latest version: you mentioned that you're using v0.10.17 of the llama-index package. The latest version is v0.10.18, and it's possible that this issue has been fixed there. You can update with pip:
   pip install --upgrade llama-index
3. Check your parser_function: if you're using a custom parser_function, make sure it's correctly parsing the evaluation response. The default parser function expects the score to be the first line of the response and the reasoning to be the rest of the response (see the sketch below).

If none of these suggestions resolve the issue, it would be helpful if you could provide more information about your judge_llm and its apredict method, specifically the format of the string it returns. This will help us better understand what might be causing the issue.
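As an illustration of suggestion 3, a custom parser along these lines could be passed in. This is a minimal sketch: the helper name is made up, and the commented-out line assumes CorrectnessEvaluator accepts a parser_function argument as implied above.

```python
from typing import Optional, Tuple

def robust_parser(eval_response: str) -> Tuple[Optional[float], Optional[str]]:
    # Strip surrounding whitespace so a leading "\n" does not leave an empty score string.
    cleaned = eval_response.strip()
    if not cleaned:
        return None, None
    score_str, _, reasoning_str = cleaned.partition("\n")
    return float(score_str), reasoning_str.strip()

# evaluator = CorrectnessEvaluator(llm=judge_llm, parser_function=robust_parser)
```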
For more information, you can refer to the LlamaIndex documentation and the source code for the RagEvaluatorPack.
Hi,
The problem seems to be at line 182 of the ~/miniconda3/envs/llm-env4/lib/python3.10/site-packages/llama_index/core/evaluation/eval_utils.py script:

182 score_str, reasoning_str = eval_response.split("\n", 1)
I'm using Zephyr-7b or Llama-2-70b-chat models as judges. In both cases they produce evaluations containing a leading \n character, for instance:
4.5
The generated answer provides a comprehensive overview ...
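To make the failure concrete, here is a minimal sketch (using a hypothetical response string) of what happens when that line receives a response with a leading newline:

```python
eval_response = "\n4.5\nThe generated answer provides a comprehensive overview ..."

score_str, reasoning_str = eval_response.split("\n", 1)
print(repr(score_str))  # '' -- the leading newline leaves an empty first element
float(score_str)        # ValueError: could not convert string to float: ''
```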
The quick fix is to modify line 182 by invoking the .strip() method before the .split(), like this:

score_str, reasoning_str = eval_response.strip().split("\n", 1)
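With the same hypothetical response string as above, the stripped version parses cleanly:

```python
score_str, reasoning_str = eval_response.strip().split("\n", 1)
print(score_str)         # 4.5
print(float(score_str))  # 4.5
print(reasoning_str)     # The generated answer provides a comprehensive overview ...
```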
Can you propose this fix to the RagEvaluatorPack maintainer?
@logan-markewich, this one's a bit out of my wheelhouse, could you step in and provide your guidance?
Modified the parser to make it work.
Bug Description
While running the following code block:
I get the error
ValueError: could not convert string to float:
after over 440 Q/A pairs (10 samples per batch) had been processed.
Version
v0.10.17
Steps to Reproduce
Here is the list of packages I'm using:
The error stack is the following:
Relevant Logs/Tracebacks
No response