Open amin-kh96 opened 1 month ago
The error you're encountering indicates that the CustomRagasLLM class is missing implementations for the abstract methods generate_text and agenerate_text from the BaseRagasLLM class. Python raises this error when those method names are not defined in the class body, so first check that both methods are indented inside the class and spelled exactly as in the base class. It is also worth making the method signatures match the ones BaseRagasLLM expects.
Here's what you need to do:
Ensure Method Signatures Match: The method signatures in your subclass should match those defined in the BaseRagasLLM class. Here are the signatures based on the BaseRagasLLM class:

    def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: Callbacks = None,
    ) -> LLMResult:
        # Your implementation here

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: t.Optional[float] = None,
        stop: t.Optional[t.List[str]] = None,
        callbacks: Callbacks = None,
    ) -> LLMResult:
        # Your implementation here

Adjust Your Implementations: Update your generate_text and agenerate_text methods to match these signatures. You will need to adjust the logic inside these methods to handle the additional parameters and return an LLMResult object.
Here's an example of how you might adjust your generate_text method:

    def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: Callbacks = None,
    ) -> LLMResult:
        # Example implementation: convert the PromptValue to plain text first
        text = self._call(prompt.to_string())
        # LLMResult.generations is a list of lists (one inner list per prompt)
        return LLMResult(generations=[[Generation(text=text)]])

And for agenerate_text:

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: t.Optional[float] = None,
        stop: t.Optional[t.List[str]] = None,
        callbacks: Callbacks = None,
    ) -> LLMResult:
        # Example implementation: convert the PromptValue to plain text first
        text = self._call(prompt.to_string())
        # LLMResult.generations is a list of lists (one inner list per prompt)
        return LLMResult(generations=[[Generation(text=text)]])

Make sure to replace PromptValue, LLMResult, and Generation with the appropriate types or classes used in your project. This should resolve the error and allow you to instantiate your CustomRagasLLM class.
To continue talking to Dosu, mention @dosu.
@dosu. One more thing: I am doing this to bypass the LLM requirement, because I already have the embeddings of the context, the query, and the answer. I created this class only to get around the API key requirement (the OpenAI API key). Is your answer still correct in this situation?
Yes, the solution I provided is still applicable to your situation. Even if you are bypassing the API key requirements, the CustomRagasLLM class must still implement the abstract methods generate_text and agenerate_text with the correct method signatures as defined in the BaseRagasLLM class. This is necessary to satisfy the requirements of the abstract base class and avoid the instantiation error.
By implementing these methods with the correct signatures, you ensure that your subclass conforms to the expected interface, regardless of whether you are using an API key or not. You can then customize the logic within these methods to suit your needs, such as using precomputed embeddings or other local resources instead of making API calls.
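As a rough illustration of "using precomputed embeddings instead of making API calls", the sketch below serves vectors from a dictionary. The class name PrecomputedEmbeddings and the sample data are hypothetical, and the class is plain Python rather than an actual BaseRagasEmbeddings subclass; embed_query and embed_documents are the method names of the LangChain-style embeddings interface that Ragas builds on.

```python
from typing import Dict, List


class PrecomputedEmbeddings:
    """Hypothetical stand-in: serves stored vectors instead of calling a model."""

    def __init__(self, vectors: Dict[str, List[float]]):
        # Maps each text to the embedding that was computed for it earlier.
        self._vectors = vectors

    def embed_query(self, text: str) -> List[float]:
        # Fail loudly if a text was never embedded, rather than guessing.
        if text not in self._vectors:
            raise KeyError(f"no precomputed embedding for: {text!r}")
        return self._vectors[text]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(t) for t in texts]


# Made-up sample data, for illustration only.
store = PrecomputedEmbeddings(
    {"what is ragas?": [0.1, 0.2], "a context chunk": [0.3, 0.4]}
)
print(store.embed_documents(["what is ragas?", "a context chunk"]))
```

A lookup like this only works when the stored vectors are keyed by the original text, so the text that produced each embedding has to be kept alongside it.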
@dosu. what do you mean by correct signatures? also, I don't understand this part: Make sure to replace PromptValue, LLMResult, and Generation with the appropriate types or classes used in your project. This should resolve the error and allow you to instantiate your CustomRagasLLM class.
By "correct signatures," I mean that the method definitions in your subclass should exactly match the method definitions in the BaseRagasLLM class. This includes the method name, parameters, and return type. Here are the correct method signatures for generate_text and agenerate_text in the BaseRagasLLM class:

    def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult

Regarding the part about replacing PromptValue, LLMResult, and Generation with the appropriate types or classes:
PromptValue is a type used for the prompt parameter. It should be replaced with whatever type or class you are using for prompts in your project.
LLMResult is the expected return type of these methods. It should be replaced with the actual class or type you are using to represent the result of a language model generation.
Generation is a class used within LLMResult to represent individual text generations.
In your implementation, ensure that you are using the correct types or classes that are defined in your project or the libraries you are using. This will help you avoid type-related errors and ensure that your subclass is correctly implementing the abstract methods.
@dosu. I do not know why, when I create the subclass, I get the error "Can't instantiate abstract class CustomRagasLLM without an implementation for abstract method 'agenerate_text'", even though I used it in my code. Can you explain why I get this instantiation error?
The error occurs because your subclass CustomRagasLLM must implement the abstract method agenerate_text from the BaseRagasLLM class. Python decides this by name: the method has to be defined inside the class body (indented under the class statement) and spelled exactly agenerate_text. A method defined at module level, outside the class, does not count. Beyond that, the method should follow the signature in the base class so that Ragas can call it. Here's the signature to use:

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult:
        # Your implementation here

Ensure that your method is defined inside the class and matches this signature, including all parameters and return type. This will resolve the instantiation error [1].
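For what it's worth, the instantiation check in Python's abc module looks only at whether each abstract method name has been overridden; it does not compare signatures. The sketch below uses hypothetical classes (Base, Missing, DifferentSignature) rather than the real Ragas classes to show both cases.

```python
from abc import ABC, abstractmethod


class Base(ABC):
    @abstractmethod
    def generate_text(self, prompt, n=1):
        ...


class Missing(Base):
    pass  # generate_text is not overridden, so instantiation fails


class DifferentSignature(Base):
    # Overridden by name, with fewer parameters than the base method.
    def generate_text(self, prompt):
        return f"echo: {prompt}"


try:
    Missing()
    missing_error = None
except TypeError as exc:
    missing_error = str(exc)

instance = DifferentSignature()  # instantiates without error
print(missing_error)
print(instance.generate_text("hi"))
```

So a "Can't instantiate abstract class" error points to a method that is absent from the class body (for example because of a typo or an indentation slip), while a mismatched signature would only surface later, when the library calls the method with arguments it does not accept.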
@dosu. My Python skills are not at an expert level. I think I need to use 'await' in the async function you gave me above. can you explain it to me, please?
To use await in an async function, you need to call asynchronous functions with it. This allows the function to pause and wait for the result of the asynchronous operation without blocking the entire program. Here's how you can use await in the agenerate_text method:

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult:
        # Example of using await with an asynchronous call
        result = await some_async_function(prompt)
        # LLMResult.generations is a list of lists (one inner list per prompt)
        return LLMResult(generations=[[Generation(text=result)]])

In this example, some_async_function is a placeholder for any asynchronous function you might be calling. Using await ensures that the function waits for some_async_function to complete before proceeding. This is essential for handling asynchronous operations properly in Python [1][2].
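To make the behaviour of await concrete, here is a small self-contained sketch using only the standard library. The function names fetch_answer and main are made up for illustration, and asyncio.sleep stands in for real asynchronous work such as a network call.

```python
import asyncio


async def fetch_answer(prompt: str) -> str:
    # Stand-in for real asynchronous work such as a network call.
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"


async def main() -> str:
    # Calling an async function without await only creates a coroutine object.
    pending = fetch_answer("ignored")
    is_coroutine = asyncio.iscoroutine(pending)
    pending.close()  # discard it cleanly so Python does not warn about it

    # await pauses main() until fetch_answer finishes, then gives back its result.
    result = await fetch_answer("hello")
    return f"{is_coroutine} / {result}"


output = asyncio.run(main())
print(output)
```

asyncio.run is what starts the event loop from ordinary synchronous code; inside an async function you use await instead.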
@dosu. Can you give me these functions in an easier way, I mean without using the async function? My Python skill is not so good at this moment. These are the codes:

    def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult

To implement the generate_text function without using async, you can keep it synchronous. Here's how you can do it:

    def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult:
        # Example implementation: convert the PromptValue to plain text first
        text = self._call(prompt.to_string())
        # LLMResult.generations is a list of lists (one inner list per prompt)
        return LLMResult(generations=[[Generation(text=text)]])

For the agenerate_text function, Ragas awaits the result, so the method still has to be declared with async def. You do not need to use await inside it, though; it can simply call the synchronous generate_text method:

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult:
        # Call the synchronous method; no await is needed here
        return self.generate_text(prompt, n, temperature, stop, callbacks)

This way, the only async-specific piece left is the async keyword on the method definition, while both required methods are still implemented [1].
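When the caller awaits agenerate_text, the method needs to be declared with async def, but its body can stay entirely synchronous. A minimal sketch of that pattern, using a hypothetical EchoLLM class with plain strings in place of the Ragas types:

```python
import asyncio


class EchoLLM:
    def generate_text(self, prompt: str) -> str:
        # All real work lives in the synchronous method.
        return f"Processed: {prompt}"

    async def agenerate_text(self, prompt: str) -> str:
        # No await inside; the async keyword alone makes the call awaitable.
        return self.generate_text(prompt)


async def caller(llm: EchoLLM) -> str:
    # This is what an async framework does with the method.
    return await llm.agenerate_text("hi")


llm = EchoLLM()
sync_result = llm.generate_text("hi")
async_result = asyncio.run(caller(llm))
print(sync_result == async_result)
```

The trade-off is that the synchronous body blocks the event loop while it runs, which is usually acceptable for fast local logic but not for slow calls.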
@dosu. Please check my code; if there is any error or mistake, help me to solve it. My code:

    class CustomRagasLLM(BaseRagasLLM):
        def __init__(self, api_key: str = None):
            """
            Initialize the custom LLM, optionally using an API key if necessary.
            """
            self.api_key = api_key

        def _call(self, prompt: str) -> str:
            """
            Process the prompt and return a result. This can be customized to
            use a local model or perform any required logic.
            """
            if not self.api_key:
                return f"Processed: {prompt} (without API key)"
            else:
                # Handle LLM response if using an API
                return f"Processed: {prompt} (with API key: {self.api_key})"

        def generate_text(
            self,
            prompt: PromptValue,
            n: int = 1,
            temperature: float = 1e-8,
            stop: t.Optional[t.List[str]] = None,
            callbacks: t.List = [],
        ) -> LLMResult:
            text = self._call(prompt)
            return LLMResult(generations=[Generation(text=text)])

        def agenerate_text(
            self,
            prompt: PromptValue,
            n: int = 1,
            temperature: float = 1e-8,
            stop: t.Optional[t.List[str]] = None,
            callbacks: t.List = [],
        ) -> LLMResult:
            return self.generate_text(prompt, n, temperature, stop, callbacks)

    custom_llm = CustomRagasLLM(api_key=None)

Your CustomRagasLLM class implementation looks mostly correct, but there are a few adjustments needed:
Async Method: The agenerate_text method should be defined with async def, because Ragas awaits it. It can still just call the synchronous generate_text method, so no await is needed inside it.
Return Types: Ensure that LLMResult and Generation are correctly imported and used. LLMResult expects generations to be a list of lists, and the PromptValue should be converted to a string before it is passed to _call.
Here's a revised version of your class:

    import typing as t

    from ragas.llms import BaseRagasLLM
    # PromptValue from langchain_core is used here; adjust the import if your Ragas version provides its own
    from langchain_core.prompt_values import PromptValue
    from langchain_core.outputs import LLMResult, Generation

    class CustomRagasLLM(BaseRagasLLM):
        def __init__(self, api_key: str = None):
            self.api_key = api_key

        def _call(self, prompt: str) -> str:
            if not self.api_key:
                return f"Processed: {prompt} (without API key)"
            else:
                return f"Processed: {prompt} (with API key: {self.api_key})"

        def generate_text(
            self,
            prompt: PromptValue,
            n: int = 1,
            temperature: float = 1e-8,
            stop: t.Optional[t.List[str]] = None,
            callbacks: t.List = [],
        ) -> LLMResult:
            # Convert the PromptValue to plain text before processing it
            text = self._call(prompt.to_string())
            # generations is a list of lists (one inner list per prompt)
            return LLMResult(generations=[[Generation(text=text)]])

        async def agenerate_text(
            self,
            prompt: PromptValue,
            n: int = 1,
            temperature: float = 1e-8,
            stop: t.Optional[t.List[str]] = None,
            callbacks: t.List = [],
        ) -> LLMResult:
            # Declared async so it can be awaited; the body stays synchronous
            return self.generate_text(prompt, n, temperature, stop, callbacks)

    custom_llm = CustomRagasLLM(api_key=None)

Make sure you have the correct imports for PromptValue, LLMResult, and Generation [1].
@dosu. My dataset is in this format: [{'question': [...], 'answer': [...], 'contexts': [...]}, {'question': [...], 'answer': [...], 'contexts': [...]}, ...]. The value of each key is a list of float numbers (computed embeddings). I already created subclasses of BaseRagasEmbeddings and BaseRagasLLM, but I get this error now:
Exception has occurred: AttributeError
'list' object has no attribute 'get_sample_type'
File "C:\Users\Amin\OneDrive - unige.it\Documenti\projectss\ragas-prototype\src\rag_eva.py", line 215, in
The error you're encountering is because the evaluate function expects the dataset to be in a specific format, but it seems like you're passing a list of dictionaries instead of the required format.
The evaluate function in the Ragas library expects the dataset to be a dictionary with the following structure:
dataset: dict[question: list[str], ground_truth: list[list[str]]]
Here's how you can adjust your dataset to match the expected format:
Convert Your Dataset: Transform your list of dictionaries into the required dictionary format. You need to extract the question and ground_truth (which might be your contexts or answer) from your data.
Example Conversion:

    ragas_data = {
        "question": [entry["question"] for entry in your_data],
        "ground_truth": [entry["contexts"] for entry in your_data]  # or "answer" if that's your ground truth
    }

Pass ragas_data when calling the evaluate function. This should resolve the AttributeError by ensuring the dataset is in the correct format for the evaluate function [1][2].
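The conversion from a list of per-sample dictionaries to one dictionary of columns can be sketched in plain Python as follows. The helper name rows_to_columns and the sample values are made up for illustration; the check that every row has the same keys is an added safeguard, not something the thread requires.

```python
from typing import Any, Dict, List


def rows_to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Turn [{'a': 1, 'b': 2}, ...] into {'a': [1, ...], 'b': [2, ...]}."""
    if not rows:
        return {}
    keys = list(rows[0].keys())
    for i, row in enumerate(rows):
        # Every row must carry the same keys, otherwise columns would misalign.
        if set(row.keys()) != set(keys):
            raise ValueError(f"row {i} has keys {sorted(row)}; expected {sorted(keys)}")
    return {key: [row[key] for row in rows] for key in keys}


# Made-up sample rows, for illustration only.
sample_rows = [
    {"question": "q1", "answer": "a1", "contexts": ["c1", "c2"]},
    {"question": "q2", "answer": "a2", "contexts": ["c3"]},
]
columns = rows_to_columns(sample_rows)
print(columns["question"])
print(columns["contexts"])
```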
@dosu. But my dataset is already in this format. I give you the code; take a look:

    ragas_data = [
        {
            "question": entry["text_vector_1"],  # Assuming this is a list of strings
            "answer": entry["text_vector_2"],  # Assuming this is a list of strings
            "contexts": entry["chunks"]  # Assuming this is a list of lists of strings
        }
        for entry in data_set
    ]

    formatted_data = {
        "question": [entry["question"] for entry in ragas_data],
        "contexts": [entry["contexts"] for entry in ragas_data],
        "answer": [entry["answer"] for entry in ragas_data]
    }

    ragas_dataset = Dataset.from_dict(formatted_data)

To resolve the 'list' object has no attribute 'get_sample_type' error, you need to ensure that your dataset is an instance of EvaluationDataset, not just a list or a Hugging Face Dataset. Here's how you can convert your data into an EvaluationDataset:
Use EvaluationDataset.from_list: Convert your list of dictionaries into an EvaluationDataset.
Here's how you can do it:

    from ragas import EvaluationDataset  # adjust if your installed version exposes it elsewhere

    # Assuming data_set is your list of dictionaries
    ragas_data = [
        {
            "question": entry["text_vector_1"],
            "answer": entry["text_vector_2"],
            "contexts": entry["chunks"]
        }
        for entry in data_set
    ]

    # Create an EvaluationDataset instance
    evaluation_dataset = EvaluationDataset.from_list(ragas_data)

    # Now you can use this dataset with the evaluate function
    evaluation_report = evaluate(evaluation_dataset, metrics=metrics, embeddings=ragas_embeddings, llm=custom_llm)

EvaluationDataset is imported from the ragas package here; if your installed version defines it in a different module, adjust the import accordingly. This should resolve the error by ensuring the dataset is in the correct format for the evaluate function [1].
I wrote this code, and I am using subclasses of BaseRagasLLM and BaseRagasEmbeddings. I already have the embeddings, and I was getting an error asking me to set the API key. To solve that, since I already have everything that is required, I created a subclass of BaseRagasLLM, but now I face this error, even though I already defined what it needs in my subclass. Please take a look at my code and my error and help me to fix it.
the code:

    import json
    from transformers import AutoModel, AutoTokenizer
    import torch
    import numpy as np
    from datasets import Dataset
    from ragas.embeddings import BaseRagasEmbeddings
    from ragas.metrics import context_utilization, ContextUtilization
    from ragas.llms import BaseRagasLLM
    import asyncio
    from ragas import evaluate

    # Load the ground truth data
    file_path = 'C:\Users\Amin\OneDrive - unige.it\Documenti\projectss\ragas-prototype\src\assets\GT.json'
    with open(file_path) as f:
        ground_truth_data = json.load(f)

    # Load the question and the answer and the chunks
    file_path = 'C:\Users\Amin\OneDrive - unige.it\Documenti\projectss\ragas-prototype\src\assets\user_llm_interaction_embeddings_c1521dd5_b819_4241_b3a4_3e5c1388037c.json'
    with open(file_path) as f:
        llm = json.load(f)

    # Initialize an empty list to hold the new dataset
    data_set = []

    # Iterate through the list and combine every two dictionaries
    for i in range(0, len(llm), 2):
        combined_dict = {
            "text_vector_1": llm[i].get("text_vector", []),
            "text_vector_2": llm[i + 1].get("text_vector", []),
            'chunks': llm[i + 1].get('chunks', [])
        }
        data_set.append(combined_dict)

    def map_chunks(data_set, ground_truth_data):
        for item in data_set:  # Iterate over each dictionary in data_set
            c = []  # Reset c for each item
            for chunk_id in item['chunks']:  # Loop through 'chunks' in the current dictionary
                for element in ground_truth_data:  # Loop through ground_truth_data
                    if element['id'] == chunk_id:  # Match chunk_id with element's id
                        c.append(element['text_vector'])  # Append the matching text_vector to c
            item['chunks'] = c  # Replace the original 'chunks' (ids) with the mapped text_vector values

    data_set = map_chunks(data_set, ground_truth_data)

    # Assuming data_set is a list of dictionaries
    ragas_data = [
        {
            "question": entry["text_vector_1"],  # Assuming this is a list of strings
            "answer": entry["text_vector_2"],  # Assuming this is a list of strings
            "contexts": entry["chunks"]  # Assuming this is a list of lists of strings
        }
        for entry in data_set
    ]

    # Create the required structure for Dataset
    formatted_data = {
        "question": [entry["question"] for entry in ragas_data],
        "contexts": [entry["contexts"] for entry in ragas_data],
        "answer": [entry["answer"] for entry in ragas_data]
    }

    model_name = 'distilbert-base-uncased'

    class CustomHuggingFaceRagasEmbeddings(BaseRagasEmbeddings):
        def __init__(self, model_name: str, custom_embeddings: list = None):
            """
            Initialize the Custom Hugging Face Ragas Embeddings with the specified model and custom embeddings.

    # Initialize the custom embeddings class
    custom_embeddings = CustomHuggingFaceRagasEmbeddings(ragas_data)
    ragas_embeddings = CustomHuggingFaceRagasEmbeddings(model_name=model_name, custom_embeddings=custom_embeddings)

    # Define the custom LLM class
    class CustomRagasLLM(BaseRagasLLM):
        def __init__(self, api_key: str = None):
            """
            Initialize the custom LLM, optionally using an API key if necessary.
            """
            self.api_key = api_key

    # Initialize the custom LLM class (set api_key=None if you don't need it)
    custom_llm = CustomRagasLLM(api_key=None)

    # Define the evaluation metrics
    metrics = [context_utilization]

    # lets define a custom function of evaluate function
    def custom_evaluate(ragas_data, metrics, embeddings: BaseRagasEmbeddings):
        """
        Custom evaluation function that avoids using OpenAI API.
        results = {}
        return results

    # Run the evaluation
    evaluation_report = evaluate(ragas_data, metrics=metrics, embeddings=ragas_embeddings, llm=custom_llm)  # Pass your custom LLM here

    # Print the evaluation results
    print("RAGAS Evaluation Report:")
    print(evaluation_report)

the error:

    Exception has occurred: TypeError
    Can't instantiate abstract class CustomRagasLLM without an implementation for abstract methods 'agenerate_text', 'generate_text'
    File "C:\Users\Amin\OneDrive - unige.it\Documenti\projectss\ragas-prototype\src\rag_eva.py", line 158, in
        custom_llm = CustomRagasLLM(api_key=None)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    TypeError: Can't instantiate abstract class CustomRagasLLM without an implementation for abstract methods 'agenerate_text', 'generate_text'
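
Separately from the abstract-method error, one detail in the script as posted may be worth checking: map_chunks has no return statement, yet its result is assigned back to data_set, which would leave data_set as None. A possible rewrite is sketched below. It assumes that ids in the ground-truth data are unique (the dictionary lookup keeps one vector per id), and the sample data is made up for illustration.

```python
from typing import Any, Dict, List


def map_chunks(
    data_set: List[Dict[str, Any]],
    ground_truth_data: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    # Index ground-truth vectors by id once, instead of rescanning the list per chunk.
    vectors_by_id = {element["id"]: element["text_vector"] for element in ground_truth_data}
    for item in data_set:
        # Replace chunk ids with their vectors; ids with no match are skipped.
        item["chunks"] = [vectors_by_id[cid] for cid in item["chunks"] if cid in vectors_by_id]
    return data_set  # without this, the caller would receive None


# Made-up sample data, for illustration only.
ground_truth = [
    {"id": "a", "text_vector": [0.1, 0.2]},
    {"id": "b", "text_vector": [0.3, 0.4]},
]
samples = [{"text_vector_1": [1.0], "text_vector_2": [2.0], "chunks": ["b", "a", "missing"]}]
mapped = map_chunks(samples, ground_truth)
print(mapped[0]["chunks"])
```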