explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

Error coming in example code for Answer Correctness #1118

Closed ekaanshkhosla closed 1 month ago

ekaanshkhosla commented 1 month ago

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug: An error occurs in the example code for Answer Correctness.

Code:

from datasets import Dataset 
from ragas.metrics import faithfulness, answer_correctness
from ragas import evaluate

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset,metrics=[answer_correctness])
score.to_pandas()

Error:

Exception in thread Thread-10:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.10/site-packages/ragas/executor.py", line 87, in run
    results = self.loop.run_until_complete(self._aresults())
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 625, in run_until_complete
    self._check_running()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 584, in _check_running
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
/opt/conda/lib/python3.10/threading.py:1018: RuntimeWarning: coroutine 'Runner._aresults' was never awaited
  self._invoke_excepthook(self)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
/opt/conda/lib/python3.10/genericpath.py:77: RuntimeWarning: coroutine 'Executor.wrap_callable_with_index.<locals>.wrapped_callable_async' was never awaited
  m = tuple(map(os.fspath, m))
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
---------------------------------------------------------------------------
ExceptionInRunner                         Traceback (most recent call last)
Cell In[11], line 11
      5 data_samples = {
      6     'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
      7     'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
      8     'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
      9 }
     10 dataset = Dataset.from_dict(data_samples)
---> 11 score = evaluate(dataset,metrics=[answer_correctness])
     12 score.to_pandas()

File /opt/conda/lib/python3.10/site-packages/ragas/evaluation.py:255, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, is_async, run_config, raise_exceptions, column_map)
    252     if not evaluation_group_cm.ended:
    253         evaluation_rm.on_chain_error(e)
--> 255     raise e
    256 else:
    257     result = Result(
    258         scores=Dataset.from_list(scores),
    259         dataset=dataset,
    260         binary_columns=binary_metrics,
    261     )

File /opt/conda/lib/python3.10/site-packages/ragas/evaluation.py:237, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, is_async, run_config, raise_exceptions, column_map)
    235 results = executor.results()
    236 if results == []:
--> 237     raise ExceptionInRunner()
    239 # convert results to dataset_like
    240 for i, _ in enumerate(dataset):

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.
/opt/conda/lib/python3.10/site-packages/ipykernel/iostream.py:122: RuntimeWarning: coroutine 'as_completed.<locals>.sema_coro' was never awaited
  await self._event_pipe_gc()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

Why is this error occurring? It did not occur earlier.


dosubot[bot] commented 1 month ago

The error occurs because the evaluate function is trying to run an event loop that is already active. This is common in environments like Jupyter notebooks, where an event loop is already running.

To resolve this, you can set the is_async parameter to False when calling the evaluate function. This will run the evaluation synchronously and avoid conflicts with the existing event loop.

Here is the modified code:

from datasets import Dataset 
from ragas.metrics import faithfulness, answer_correctness
from ragas import evaluate

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset, metrics=[answer_correctness], is_async=False)
score.to_pandas()

By setting is_async to False, the evaluation will run synchronously, preventing the RuntimeError related to the event loop [1].
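
For context, the RuntimeError in the traceback is standard asyncio behaviour rather than anything specific to ragas: calling run_until_complete on a loop that is already running raises it. The following is a minimal sketch using only the standard library. The names work and call_from_running_loop are illustrative and do not come from ragas, and the sketch does not reproduce how ragas schedules its jobs internally.

```python
import asyncio


async def work() -> int:
    # Stand-in for any coroutine that a library wants to run to completion.
    return 42


async def call_from_running_loop() -> str:
    # We are inside a coroutine, so the loop returned here is already running.
    loop = asyncio.get_running_loop()
    coro = work()
    try:
        # Asking a running loop to "run until complete" is not allowed.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        # Close the coroutine so Python does not warn that it was never awaited.
        coro.close()
        return str(exc)
    return "no error"


message = asyncio.run(call_from_running_loop())
print(message)
```

A notebook kernel typically keeps an event loop running while cells execute, which is presumably why the same call can fail there and still work in a plain script.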


ekaanshkhosla commented 1 month ago

I am still getting the error:

from datasets import Dataset
from ragas.metrics import faithfulness, answer_correctness
from ragas import evaluate

data_samples = {
    'question': ['When and where was Einstein born?'],
    'answer': ['In 1879, Einstein was born in Germany.'],
    'ground_truth': ['Einstein was born in 1879 in Germany.']
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset,metrics=[answer_correctness], is_async=False)
score.to_pandas()

Exception in thread Thread-14:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.10/site-packages/ragas/executor.py", line 87, in run
    results = self.loop.run_until_complete(self._aresults())
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 625, in run_until_complete
    self._check_running()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 584, in _check_running
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
/opt/conda/lib/python3.10/threading.py:1018: RuntimeWarning: coroutine 'Runner._aresults' was never awaited
  self._invoke_excepthook(self)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

ExceptionInRunner                         Traceback (most recent call last)
Cell In[17], line 11
      5 data_samples = {
      6     'question': ['When and where was Einstein born?'],
      7     'answer': ['In 1879, Einstein was born in Germany.'],
      8     'ground_truth': ['Einstein was born in 1879 in Germany.']
      9 }
     10 dataset = Dataset.from_dict(data_samples)
---> 11 score = evaluate(dataset,metrics=[answer_correctness], is_async=False)
     12 score.to_pandas()

File /opt/conda/lib/python3.10/site-packages/ragas/evaluation.py:255, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, is_async, run_config, raise_exceptions, column_map)
    252     if not evaluation_group_cm.ended:
    253         evaluation_rm.on_chain_error(e)
--> 255     raise e
    256 else:
    257     result = Result(
    258         scores=Dataset.from_list(scores),
    259         dataset=dataset,
    260         binary_columns=binary_metrics,
    261     )

File /opt/conda/lib/python3.10/site-packages/ragas/evaluation.py:237, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, is_async, run_config, raise_exceptions, column_map)
    235 results = executor.results()
    236 if results == []:
--> 237     raise ExceptionInRunner()
    239 # convert results to dataset_like
    240 for i, _ in enumerate(dataset):

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.
/opt/conda/lib/python3.10/site-packages/ipykernel/iostream.py:122: RuntimeWarning: coroutine 'as_completed.<locals>.sema_coro' was never awaited
  await self._event_pipe_gc()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

ekaanshkhosla commented 1 month ago

Even this does not work:

from datasets import Dataset
from ragas.metrics import faithfulness
from ragas import evaluate

data_samples = {
    'question': ['Where and when was Einstein born?'],
    'answer': ['Einstein was born in Germany on 20th March 1879.'],
    'contexts' : [['Albert Einstein (born 14 March 1879) was a German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time']]
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset,metrics=[faithfulness])
score.to_pandas()

Error:

Exception in thread Thread-15:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.10/site-packages/ragas/executor.py", line 87, in run
    results = self.loop.run_until_complete(self._aresults())
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 625, in run_until_complete
    self._check_running()
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 584, in _check_running
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
/opt/conda/lib/python3.10/threading.py:1018: RuntimeWarning: coroutine 'Runner._aresults' was never awaited
  self._invoke_excepthook(self)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

ExceptionInRunner                         Traceback (most recent call last)
Cell In[18], line 11
      5 data_samples = {
      6     'question': ['Where and when was Einstein born?'],
      7     'answer': ['Einstein was born in Germany on 20th March 1879.'],
      8     'contexts' : [['Albert Einstein (born 14 March 1879) was a German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time']]
      9 }
     10 dataset = Dataset.from_dict(data_samples)
---> 11 score = evaluate(dataset,metrics=[faithfulness])
     12 score.to_pandas()

File /opt/conda/lib/python3.10/site-packages/ragas/evaluation.py:255, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, is_async, run_config, raise_exceptions, column_map)
    252     if not evaluation_group_cm.ended:
    253         evaluation_rm.on_chain_error(e)
--> 255     raise e
    256 else:
    257     result = Result(
    258         scores=Dataset.from_list(scores),
    259         dataset=dataset,
    260         binary_columns=binary_metrics,
    261     )

File /opt/conda/lib/python3.10/site-packages/ragas/evaluation.py:237, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, is_async, run_config, raise_exceptions, column_map)
    235 results = executor.results()
    236 if results == []:
--> 237     raise ExceptionInRunner()
    239 # convert results to dataset_like
    240 for i, _ in enumerate(dataset):

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.

emreds commented 1 month ago

I had the same problem with Azure-deployed GPT-3.5 and GPT-4. I noticed that it occurs only in Jupyter notebooks. Running the code as a plain script solved the problem for me with GPT-3.5.

However, it kept occurring with GPT-4. I suspect that it times out somewhere in the code, but I'm not sure.

arianpasquali commented 1 month ago

Same problem here. Can't run this on Jupyter notebooks, even with is_async disabled. Only works running via plain Python script.
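
One way to tell the two environments apart is to check whether an event loop is already running in the current thread. This is a small standard-library sketch; event_loop_is_running is a hypothetical helper name and is not part of ragas.

```python
import asyncio


def event_loop_is_running() -> bool:
    """Return True if an asyncio event loop is running in this thread."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # get_running_loop() raises RuntimeError when no loop is running.
        return False
    return True


async def check_inside_loop() -> bool:
    # Inside a coroutine a loop is running by definition.
    return event_loop_is_running()


print("at top level:", event_loop_is_running())
print("inside a coroutine:", asyncio.run(check_inside_loop()))
```

Run as a plain script, the top-level check is expected to report no running loop. In a notebook cell it will usually report that one is running, which matches the behaviour described in this thread.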

jjmachan commented 1 month ago

@arianpasquali @emreds @ekaanshkhosla this was a problem with version 0.1.10 of ragas. Could you try upgrading to the latest version and let us know whether that fixes it?

Apologies for the bug.
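
To check which version is installed before and after upgrading, something like the following should work. It is a sketch using importlib.metadata from the standard library. installed_version and is_affected are hypothetical helpers, and the only version number taken from this thread is 0.1.10.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# The release named in this thread as having the problem.
AFFECTED_VERSION = "0.1.10"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_affected(installed: Optional[str]) -> bool:
    """Return True only for the exact release mentioned in the thread."""
    return installed == AFFECTED_VERSION


ragas_version = installed_version("ragas")
if ragas_version is None:
    print("ragas is not installed in this environment")
elif is_affected(ragas_version):
    print(f"ragas {ragas_version} is the affected release; upgrade to the latest version")
else:
    print(f"ragas {ragas_version} is installed")
```

The check compares against 0.1.10 only, because that is the single version the maintainer names; whether other releases are affected is not stated here.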

emreds commented 1 month ago

Thanks for the fix! It works now!

jjmachan commented 1 month ago

@emreds thanks for confirming 🙂. Marking this as answered then.

github-actions[bot] commented 1 month ago

It seems the issue was answered, closing this now.