g-hano opened this issue 2 months ago
The issue you're encountering with the Local Model Runner in Executor raising exceptions when using Ollama and HuggingFace models to evaluate your RAG app with Ragas version 0.1.11 and Python version 3.11.3 is likely due to the `results` list being empty, which triggers an `ExceptionInRunner`.

Here's a similar issue that was discussed: setting `raise_exceptions=False` shows only a warning message instead of raising an exception. In your case, you are already using `raise_exceptions=False`, which should prevent the exceptions from stopping the execution. However, the underlying issue might still be that the models are not producing the expected results, leading to an empty `results` list.
Ensure that the models are correctly configured and producing results. Additionally, verify that the environment is correctly set up, especially if running in a Jupyter-like environment, as the `nest_asyncio` package may be required to handle nested event loops.
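For example, in a Jupyter-style notebook you can patch the running event loop before calling `evaluate` (a minimal sketch):

```python
import nest_asyncio

# Allow ragas' internal asyncio executor to run inside the notebook's
# already-running event loop.
nest_asyncio.apply()
```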
Here is a snippet of the relevant code that raises the exception when the `results` list is empty:

```python
try:
    results = exec.results()
    if results == []:
        raise ExceptionInRunner()
except Exception as e:
    raise e
```
Make sure that your models are correctly configured and producing results. If the issue persists, you might want to add additional logging or debugging to understand why the `results` list is empty.
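For example, you can turn up logging before calling `evaluate` (a minimal sketch using the standard `logging` module; it assumes ragas logs under the `ragas` logger namespace):

```python
import logging

# Print the per-job errors that are otherwise only logged (and swallowed)
# when raise_exceptions=False.
logging.basicConfig(level=logging.INFO)
logging.getLogger("ragas").setLevel(logging.DEBUG)
```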
For more details, you can refer to the source code of the `Executor` class in ragas [1].
To continue talking to Dosu, mention @dosu.
@dosu I still have the same issue; now I am trying with Ollama models.
```python
from datasets import Dataset
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_correctness,
    answer_relevancy,
    context_recall,
    context_precision,
)

data_samples = {
    'question': df['question'].tolist(),
    'answer': df['answer'].tolist(),
    'contexts': df['contexts'].apply(lambda x: [x] if isinstance(x, str) else x).tolist(),
    'ground_truth': df['ground_truth'].tolist()
}
dataset = Dataset.from_dict(data_samples)

langchain_llm = ChatOllama(model="llama3.1:8b")
langchain_embeddings = OllamaEmbeddings(model="nomic-embed-text:latest")

metrics = [
    faithfulness,
    answer_correctness,
    answer_relevancy,
    context_recall,
    context_precision,
]

score = evaluate(
    dataset=dataset,
    metrics=metrics,
    llm=langchain_llm,
    embeddings=langchain_embeddings,
    raise_exceptions=False,
)
```
```
Runner in Executor raised an exception
```
The issue with the Local Model Runner in the `Executor` raising exceptions when using Ollama models is likely due to the `raise_exceptions` attribute being set to `True`. This causes any exception that occurs within the `wrap_callable_with_index` method to be raised instead of being logged and handled gracefully.

To prevent exceptions from being raised and instead log them, ensure that `raise_exceptions` is set to `False` when initializing the `Executor`. Here is the relevant code snippet from the `Executor` class:
```python
@dataclass
class Executor:
    # ... other fields ...
    raise_exceptions: bool = False
    # ... other fields ...

    def wrap_callable_with_index(self, callable: t.Callable, counter):
        async def wrapped_callable_async(*args, **kwargs):
            result = np.nan
            try:
                result = await callable(*args, **kwargs)
            except MaxRetriesExceeded as e:
                # this only for testset generation v2
                logger.warning(f"max retries exceeded for {e.evolution}")
            except Exception as e:
                if self.raise_exceptions:
                    raise e
                else:
                    exec_name = type(e).__name__
                    exec_message = str(e)
                    logger.error(
                        "Exception raised in Job[%s]: %s(%s)",
                        counter,
                        exec_name,
                        exec_message,
                        exc_info=False,
                    )
            return counter, result

        return wrapped_callable_async
```
Ensure that `raise_exceptions` is set to `False` when initializing the `Executor`:
```python
score = evaluate(
    dataset=dataset,
    metrics=metrics,
    llm=langchain_llm,
    embeddings=langchain_embeddings,
    raise_exceptions=False,
)
```
If the issue persists, it might be due to the models not producing the expected results, leading to an empty `results` list. Verify that the models are correctly configured and producing results. Additionally, ensure that the environment is correctly set up, especially if running in a Jupyter-like environment, as the `nest_asyncio` package may be required to handle nested event loops [1].

If you continue to face issues, consider adding additional logging or debugging to understand why the `results` list is empty.
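For example, you can sanity-check that the Ollama chat model and the embedding model respond at all before running the evaluation (a minimal sketch; it assumes the same model names as in your snippet and a locally running Ollama server):

```python
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings

llm = ChatOllama(model="llama3.1:8b")
embeddings = OllamaEmbeddings(model="nomic-embed-text:latest")

# Both calls should return non-empty results if the Ollama server is
# reachable and both models have been pulled.
print(llm.invoke("Reply with the single word: ok").content)
print(len(embeddings.embed_query("hello world")))
```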
To continue talking to Dosu, mention @dosu.
@dosu I now get

```
Exception raised in Job[263]: TimeoutError()
Exception raised in Job[84]: TimeoutError()
Exception raised in Job[457]: TimeoutError()
Exception raised in Job[76]: TimeoutError()
```
To resolve the `TimeoutError` exceptions raised in the jobs when using Ollama models with the Local Model Runner in `Executor`, you can modify the `wrap_callable_with_index` method to specifically handle `TimeoutError` exceptions. Here is an example of how you can do this:
```python
def wrap_callable_with_index(self, callable: t.Callable, counter):
    async def wrapped_callable_async(*args, **kwargs):
        result = np.nan
        try:
            result = await callable(*args, **kwargs)
        except MaxRetriesExceeded as e:
            # this only for testset generation v2
            logger.warning(f"max retries exceeded for {e.evolution}")
        except TimeoutError as e:
            logger.error(f"TimeoutError in Job[{counter}]: {str(e)}")
            if self.raise_exceptions:
                raise e
        except Exception as e:
            if self.raise_exceptions:
                raise e
            else:
                exec_name = type(e).__name__
                exec_message = str(e)
                logger.error(
                    "Exception raised in Job[%s]: %s(%s)",
                    counter,
                    exec_name,
                    exec_message,
                    exc_info=False,
                )
        return counter, result

    return wrapped_callable_async
```
This modification ensures that `TimeoutError` exceptions are logged and handled appropriately, and optionally re-raised if `self.raise_exceptions` is set to `True` [1].
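As a debugging aid (a sketch reusing the `dataset`, `metrics`, and model objects defined earlier, not part of the snippet above), you can also temporarily re-run the evaluation with `raise_exceptions=True` so that the first failing job raises with its full traceback instead of being logged:

```python
# Temporarily surface the first underlying error (e.g. the TimeoutError)
# instead of logging it and returning NaN for the affected sample.
score = evaluate(
    dataset=dataset,
    metrics=metrics,
    llm=langchain_llm,
    embeddings=langchain_embeddings,
    raise_exceptions=True,
)
```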
Additionally, you can adjust the timeout duration for jobs using the `timeout` parameter in the `RunConfig` class. By default, the `timeout` is set to 180 seconds. You can increase this duration if needed:
```python
from ragas import RunConfig

config = RunConfig(timeout=300)  # Set timeout to 300 seconds
```
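The `RunConfig` then needs to be handed to `evaluate` via its `run_config` parameter (a minimal sketch, assuming the ragas 0.1.x `evaluate` signature and reusing the objects defined earlier):

```python
from ragas import RunConfig, evaluate

run_config = RunConfig(timeout=300)  # allow each job up to 300 seconds

score = evaluate(
    dataset=dataset,
    metrics=metrics,
    llm=langchain_llm,
    embeddings=langchain_embeddings,
    run_config=run_config,
    raise_exceptions=False,
)
```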
If you want more detailed logging about the `TimeoutError` exceptions, you can enable the `log_tenacity` attribute in the `RunConfig` class:
```python
run_config = RunConfig(log_tenacity=True)
```
This will log retry attempts, including `TimeoutError` exceptions, using the `tenacity` library's logging capabilities [2].
To continue talking to Dosu, mention @dosu.
**Describe the bug**
I want to use local LLMs to evaluate my RAG app. I have tried Ollama and HuggingFace models, but neither of them is working.

Ragas version: 0.1.11
Python version: 3.11.3

**Code to Reproduce**

**Error trace**

**Expected behavior**
Successfully evaluating.