AnujLahoty opened 3 months ago
To resolve the "cannot reuse already awaited coroutine" error when integrating RAGAS with llama_index using a SageMaker endpoint as a language model, you need to ensure that each coroutine is awaited only once. This error occurs when a coroutine object is awaited more than once, which Python does not allow.

In your provided code, the asynchronous methods `achat`, `astream_chat`, `acomplete`, and `astream_complete` are not implemented and raise `NotImplementedError`. You need to implement these methods properly.

Here is an example of how you might implement the `acomplete` method to avoid the "cannot reuse already awaited coroutine" error:
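For illustration, here is a minimal, self-contained snippet (independent of RAGAS and llama_index) showing how the error arises when the same coroutine object is awaited twice:

```python
import asyncio

async def fetch():
    return 42

async def main():
    coro = fetch()
    first = await coro  # a coroutine object may be awaited only once
    try:
        await coro      # second await of the same object fails
    except RuntimeError as e:
        return first, str(e)

first, error = asyncio.run(main())
print(first, error)
```

Calling `fetch()` again would create a fresh coroutine that can be awaited normally; the error only concerns re-awaiting the same object.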
```python
@llm_completion_callback()
async def acomplete(
    self, prompt: str, formatted: bool = False, **kwargs: Any
) -> CompletionResponse:
    model_kwargs = {**self.model_kwargs, **kwargs}
    if not formatted:
        prompt = self._completion_to_prompt(prompt, self.system_prompt)
    request_body = self.content_handler.serialize_input(prompt, model_kwargs)
    response = await self._client.invoke_endpoint_async(
        EndpointName=self.endpoint_name,
        Body=request_body,
        ContentType=self.content_handler.content_type,
        Accept=self.content_handler.accept,
        **self.endpoint_kwargs,
    )
    response["Body"] = self.content_handler.deserialize_output(response["Body"])
    text = self.content_handler.remove_prefix(response["Body"], prompt)
    return CompletionResponse(
        text=text,
        raw=response,
        additional_kwargs={
            "model_kwargs": model_kwargs,
            "endpoint_kwargs": self.endpoint_kwargs,
        },
    )
```
In this implementation, `invoke_endpoint_async` is a hypothetical asynchronous method that you would need to implement yourself or use if available in your AWS client library. This ensures that a fresh coroutine is created and awaited exactly once per call.

Make sure to implement similar changes for the other asynchronous methods if you plan to use them.
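Since the standard boto3 SageMaker runtime client is synchronous (the `invoke_endpoint_async` above is hypothetical), one pragmatic option is to run the blocking call in a worker thread with `asyncio.to_thread`. The sketch below uses a stand-in function in place of the real boto3 call, so the endpoint name and response shape are illustrative only:

```python
import asyncio
import json

def invoke_endpoint_sync(endpoint_name: str, body: bytes) -> dict:
    # stand-in for a synchronous SageMaker runtime invocation; a real
    # implementation would call the boto3 client's invoke_endpoint(...)
    return {"Body": json.dumps({"generated_text": "hello"}).encode()}

async def acomplete(prompt: str) -> str:
    # run the blocking call in a worker thread so the event loop stays free;
    # a fresh coroutine is created (and awaited exactly once) on every call
    response = await asyncio.to_thread(
        invoke_endpoint_sync, "my-endpoint", prompt.encode()
    )
    return json.loads(response["Body"])["generated_text"]

text = asyncio.run(acomplete("hi"))
print(text)
```

Because `acomplete("hi")` builds a new coroutine on each invocation, callers such as RAGAS can schedule many completions concurrently without ever re-awaiting the same object.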
Additionally, you can refer to the `SageMakerLLM` class in the `llama_index` library for more context on how to set up and use the SageMaker endpoint. Here is an example setup:
```python
from llama_index.llms.sagemaker import SageMakerLLM

# hooks for HuggingFaceH4/zephyr-7b-beta
def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}</s>\n"
    # ensure the prompt starts with a system turn, inserting a blank one if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt
    # add the final assistant cue
    prompt = prompt + "<|assistant|>\n"
    return prompt

def completion_to_prompt(completion):
    return f"<|system|>\n</s>\n<|user|>\n{completion}</s>\n<|assistant|>\n"

model_name = "HuggingFaceH4/zephyr-7b-beta"
api_key = "your_api_key"
region = "your_region"

llm = SageMakerLLM(
    model_name=model_name,
    api_key=api_key,
    region=region,
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)
```
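For a quick sanity check of the chat-template logic, here is a compact, self-contained variant of `messages_to_prompt`, using `SimpleNamespace` as a stand-in for llama_index's `ChatMessage` objects:

```python
from types import SimpleNamespace

def messages_to_prompt(messages):
    # simplified version of the hook above, repeated so this snippet runs alone
    prompt = ""
    for m in messages:
        prompt += f"<|{m.role}|>\n{m.content}</s>\n"
    # prepend an empty system turn if none was supplied
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt
    # end with the assistant cue so the model continues from there
    return prompt + "<|assistant|>\n"

msgs = [SimpleNamespace(role="user", content="Hi there")]
p = messages_to_prompt(msgs)
print(p)
```

Printing the result lets you verify that the prompt matches the zephyr chat template before sending it to the endpoint.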
To further ensure that coroutines are managed correctly, you can use utility functions from the `llama_index` library, such as `run_async_tasks` and `batch_gather`, which help in running multiple asynchronous tasks efficiently:
```python
from llama_index.core.async_utils import run_async_tasks, batch_gather

# Example usage of run_async_tasks
tasks = [some_async_function() for _ in range(10)]
results = run_async_tasks(tasks, show_progress=True)

# Example usage of batch_gather -- build a fresh list of coroutines,
# since the ones above have already been awaited and cannot be reused
tasks = [some_async_function() for _ in range(10)]
batch_results = await batch_gather(tasks, batch_size=5, verbose=True)
```
These utilities help manage the execution of multiple coroutines, ensuring they are awaited properly and avoiding the "cannot reuse already awaited coroutine" error [1][2][3].
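The same principle applies in your own evaluation loop: create coroutines at the point where you await them, rather than storing a list and re-awaiting it. A minimal sketch with plain `asyncio.gather` (the `evaluate` function is a hypothetical placeholder for an async metric call):

```python
import asyncio

async def evaluate(i: int) -> int:
    # placeholder for an async metric or endpoint call
    await asyncio.sleep(0)
    return i * 2

async def main():
    # the generator expression builds fresh coroutine objects each time
    # this line runs, so nothing is ever awaited twice
    return await asyncio.gather(*(evaluate(i) for i in range(5)))

results = asyncio.run(main())
print(results)
```

If you need to retry a batch, call the factory functions again to get new coroutines instead of reusing the old list.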
To continue talking to Dosu, mention @dosu.
@logan-markewich Can you please check it mate?
Is the faithfulness evaluator from ragas? From llamaindex? You left out the imports, as well as the actual stack trace after applying `nest_asyncio`.
@logan-markewich My apologies. I will ping them here ASAP. But at a very high level: I am loading my LLM from a SageMaker endpoint using llama_index's SageMaker endpoint module, then running the RAGAS evaluation metrics. It says that the `acomplete` function is not implemented for it.
Yea, the async endpoints for the SageMaker LLM are not implemented. Someone would have to contribute that in a PR.
Bug Description
So I want to get the metrics from RAGAS. Whenever I download the model locally on my EC2 instance, the metrics give me the answer, but when I use a SageMaker endpoint as the LLM it throws an error. The error says that it is an asyncio error and that I need to call `nest_asyncio.apply()`. I already did that, but the actual error message is "cannot reuse already awaited coroutine".
Version
0.10.55
Steps to Reproduce
Here is the code that can be run to reproduce the error message.
Relevant Logs/Tracebacks
No response