explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0

AnswerCorrectness raises the AssertionError: "AssertionError: AnswerSimilarity must be set" #1336

Open xidian237 opened 2 months ago

xidian237 commented 2 months ago

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Hi experts,

I use answer_correctness as the metric, but it fails with AssertionError: "AnswerSimilarity must be set".

Ragas version: 0.1.16
Python version: 3.11.9

Code to Reproduce
I want to call answer_correctness; here is the test code. The assertion error only occurs occasionally:

metrics = [globals()[metric_name] for metric_name in [metric_name]]
result = evaluate(dataset=dataset, metrics=metrics, llm=gen_ai_hub_llm,
                  embeddings=gen_ai_hub_embedding_model, raise_exceptions=True)
print(result)

Error trace

Evaluating: 0%| | 0/1 [01:14<?, ?it/s]
INFO: 10.237.90.124:57648 - "POST /v1/evaluations/dataset HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/evaluation_service/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/evaluation_service/.venv/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    return await self.app(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/evaluation_service/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 193, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/evaluation_service/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/evaluation_service/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/evaluation_service/main.py", line 138, in evaluate_with_dataset
    res = ragas_eval_process(metric, question, context, expected_output, llm_output, trace)
  File "/evaluation_service/src/evaluation/eval_by_ragas/ragas_eval_process.py", line 24, in ragas_eval_process
    result = evaluate(dataset=dataset, metrics=metrics, llm=gen_ai_hub_llm,
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/_analytics.py", line 129, in wrapper
    result = func(*args, **kwargs)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/evaluation.py", line 280, in evaluate
    raise e
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/evaluation.py", line 260, in evaluate
    results = executor.results()
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/executor.py", line 116, in results
    results = asyncio.run(_aresults())
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
  File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/executor.py", line 111, in _aresults
    r = await future
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 615, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/executor.py", line 34, in sema_coro
    return await coro
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/executor.py", line 60, in wrapped_callable_async
    raise e
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/executor.py", line 54, in wrapped_callable_async
    result = await callable(*args, **kwargs)
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/metrics/base.py", line 128, in ascore
    raise e
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/metrics/base.py", line 121, in ascore
    score = await asyncio.wait_for(
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 489, in wait_for
    return fut.result()
  File "/evaluation_service/.venv/lib/python3.11/site-packages/ragas/metrics/_answer_correctness.py", line 263, in _ascore
    assert self.answer_similarity is not None, "AnswerSimilarity must be set"
AssertionError: AnswerSimilarity must be set

Additional context
I noticed that answer_similarity is supposed to be initialized in the metric's init method:

def init(self, run_config: RunConfig):
    super().init(run_config)
    if self.answer_similarity is None and self.weights[1] != 0:
        self.answer_similarity = AnswerSimilarity(
            llm=self.llm, embeddings=self.embeddings
        )
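
A minimal way to confirm that this path populates answer_similarity is to run init directly on a standalone metric instance (a sketch; it assumes your_llm and your_embeddings are already ragas-wrapped LLM/embeddings objects):

from ragas.run_config import RunConfig
from ragas.metrics._answer_correctness import AnswerCorrectness

# Hypothetical check: construct the metric with explicit llm/embeddings,
# run init, and verify that answer_similarity has been created
metric = AnswerCorrectness(llm=your_llm, embeddings=your_embeddings)
metric.init(RunConfig())
assert metric.answer_similarity is not None  # expected to hold for the default weights [0.75, 0.25]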

dosubot[bot] commented 2 months ago

To resolve the AssertionError: "AnswerSimilarity must be set" when using the answer_correctness metric in Ragas version 0.1.16 with Python 3.11.9, you need to ensure that the AnswerSimilarity object is properly initialized. This is required when the second weight in the weights list is not zero.

Here is the relevant part of the code that checks for the AnswerSimilarity object:

if self.answer_similarity is None and self.weights[1] != 0:
    self.answer_similarity = AnswerSimilarity(
        llm=self.llm, embeddings=self.embeddings
    )

Make sure that the AnswerSimilarity object is set correctly in the init method of the AnswerCorrectness class. Here is the complete init method for reference:

def init(self, run_config: RunConfig):
    super().init(run_config)
    if self.answer_similarity is None and self.weights[1] != 0:
        self.answer_similarity = AnswerSimilarity(
            llm=self.llm, embeddings=self.embeddings
        )

Ensure that you call the init method with the appropriate RunConfig object before using the answer_correctness metric. This will initialize the AnswerSimilarity object if it is not already set and the second weight is not zero.

Here is an example of how you might initialize and use the AnswerCorrectness metric:

from ragas.run_config import RunConfig
from ragas.metrics._answer_correctness import AnswerCorrectness

# RunConfig carries execution settings (timeouts, retries); the LLM and
# embeddings are set on the metric itself
run_config = RunConfig()

# Initialize the AnswerCorrectness metric with your wrapped LLM and embeddings
answer_correctness = AnswerCorrectness(llm=your_llm, embeddings=your_embeddings)
answer_correctness.init(run_config)

# Now you can use the answer_correctness metric without encountering the AssertionError

Replace your_llm and your_embeddings with the appropriate objects required for your use case [1].
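
For reference, here is a sketch of what those objects could look like when wrapping LangChain components (the OpenAI models below are placeholders, not something used in this issue):

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

# Hypothetical wrappers; substitute whatever LLM and embeddings you actually use
your_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini"))
your_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())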

To continue talking to Dosu, mention @dosu.

xidian237 commented 2 months ago

@dosu The weights are the default:

weights: list[float] = field(default_factory=lambda: [0.75, 0.25])

dosubot[bot] commented 2 months ago

To resolve the AssertionError: "AnswerSimilarity must be set" when the weights are set to their default values [0.75, 0.25], you need to ensure that the answer_similarity attribute is properly initialized in the AnswerCorrectness class. This is done in the init method.

Here is the relevant part of the code:

def init(self, run_config: RunConfig):
    super().init(run_config)
    if self.answer_similarity is None and self.weights[1] != 0:
        self.answer_similarity = AnswerSimilarity(
            llm=self.llm, embeddings=self.embeddings
        )

Make sure that the init method is called with the appropriate RunConfig and that llm and embeddings are properly initialized before calling this method. This will ensure that answer_similarity is set correctly, avoiding the AssertionError.

Here is an example of how you might initialize and use the AnswerCorrectness metric:

from ragas.run_config import RunConfig
from ragas.metrics._answer_correctness import AnswerCorrectness

# Assuming your_llm and your_embeddings are already wrapped for ragas
run_config = RunConfig()

answer_correctness_metric = AnswerCorrectness(llm=your_llm, embeddings=your_embeddings)
answer_correctness_metric.init(run_config)

This ensures that the answer_similarity attribute is set when the weights are [0.75, 0.25] [1].
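
If the error still appears only intermittently, a debugging sketch like the following (purely illustrative, reusing the names from the reproduction code) can show whether the metric instance that reaches evaluate() ever had init run on it:

# Hypothetical debugging aid: the metric is a shared, module-level object,
# so inspect its state right before evaluate() for this request
metric = globals()[metric_name]
print(type(metric).__name__, "weights:", metric.weights)
print("llm:", metric.llm, "embeddings:", metric.embeddings)
print("answer_similarity:", metric.answer_similarity)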

To continue talking to Dosu, mention @dosu.

xidian237 commented 2 months ago

@dosu, This is my code:

metrics = [globals()[metric_name] for metric_name in [metric_name]]
result = evaluate(dataset=dataset, metrics=metrics, llm=gen_ai_hub_llm,
                  embeddings=gen_ai_hub_embedding_model, raise_exceptions=True)
print(result)

I am not initializing the metric the way you mentioned.

dosubot[bot] commented 2 months ago

To resolve the AssertionError: "AnswerSimilarity must be set" in your code, you need to ensure that the AnswerCorrectness metric is properly initialized with the AnswerSimilarity object. This can be done by calling the init method with the appropriate RunConfig object before using the evaluate function.

Here is how you can modify your code to include the initialization step:

from ragas.run_config import RunConfig
from ragas.metrics._answer_correctness import AnswerCorrectness

# RunConfig holds execution settings; the LLM and embeddings go on the metric
# (wrap them with ragas' Langchain wrappers first if they are raw LangChain objects)
run_config = RunConfig()

# Initialize the AnswerCorrectness metric
answer_correctness_metric = AnswerCorrectness(
    llm=gen_ai_hub_llm, embeddings=gen_ai_hub_embedding_model
)
answer_correctness_metric.init(run_config)

# Use the initialized metric in the evaluate function
metrics = [answer_correctness_metric]
result = evaluate(dataset=dataset, metrics=metrics, llm=gen_ai_hub_llm,
                  embeddings=gen_ai_hub_embedding_model, raise_exceptions=True)
print(result)

This ensures that the AnswerSimilarity object is properly set in the AnswerCorrectness metric, avoiding the AssertionError [1].
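
Because the reproduction code pulls a module-level metric object out of globals() and the failure is intermittent, another option worth trying (a sketch that assumes shared metric state is the culprit, which this thread does not confirm) is to build a fresh metric instance for each evaluation call and let evaluate() wire in the LLM and embeddings:

from ragas import evaluate
from ragas.metrics._answer_correctness import AnswerCorrectness

def run_answer_correctness(dataset, llm, embeddings):
    # Hypothetical helper: a new metric per call avoids sharing mutable metric
    # state (llm, embeddings, answer_similarity) across concurrent requests
    metric = AnswerCorrectness()
    return evaluate(dataset=dataset, metrics=[metric], llm=llm,
                    embeddings=embeddings, raise_exceptions=True)

result = run_answer_correctness(dataset, gen_ai_hub_llm, gen_ai_hub_embedding_model)
print(result)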

To continue talking to Dosu, mention @dosu.