Open Senthselvi opened 5 days ago
To resolve the issue where "Faithfulness is not callable" in the Ragas evaluation, you need to ensure that you are instantiating the Faithfulness class correctly. The error suggests that you might be trying to call an instance of Faithfulness as if it were a function.
Here is the correct way to instantiate and use the Faithfulness class:
faithfulness = Faithfulness()
Make sure you are not trying to call faithfulness as a function. For example, if you have something like this:
score = faithfulness()
You should change it to:
# Assuming you have a sample and callbacks defined
score = await faithfulness._single_turn_ascore(sample, callbacks)
Ensure that you are using the methods provided by the Faithfulness class correctly [1].
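The difference between calling a class and calling an instance can be checked in plain Python. The sketch below uses StubMetric, a hypothetical stand-in that is not part of Ragas, to show where the "not callable" message comes from:

```python
class StubMetric:
    """Hypothetical stand-in for a metric class; it defines no __call__."""

    name = "stub_metric"

    def score(self, answer: str, context: str) -> float:
        # Toy scoring rule, for illustration only.
        return 1.0 if answer in context else 0.0


metric = StubMetric()  # calling the class is fine: it creates an instance

print(callable(StubMetric))  # the class itself can be called
print(callable(metric))      # the instance cannot

try:
    metric()  # same mistake as faithfulness()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Use a method on the instance instead of calling the instance.
value = metric.score("Paris", "Paris is the capital of France.")
print(value)
```

The instance is only usable through its methods, which is why the call has to go through a scoring method rather than through faithfulness() itself.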
To continue talking to Dosu, mention @dosu.
Ok got it
Why is an empty DataFrame returned?
llm, embed_model = init_llm_ollama(base_url, bearer_token)
def evaluate(
    dataset: List[LLMTestCase],
    metrics: List[Callable],
    llm: llm,
    embeddings: embed_model,
    raise_exceptions: bool = True,
) -> pd.DataFrame:
    results = []
    # Convert results to DataFrame
    results_df = pd.DataFrame(results)
    return results_df
data = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'contexts': [['The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,'], ['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference']],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
data_samples = Dataset.from_dict(data)
test_cases = [
    LLMTestCase(
        question=question,
        answer=answer,
        contexts=context,
        ground_truth=ground_truth
    )
    for question, answer, ground_truth, context in zip(
        data_samples['question'],
        data_samples['answer'],
        data_samples['ground_truth'],
        data_samples['contexts']
    )
]
metrics = [
    answer_relevancy,
    # faithfulness,
    # context_recall
]
async def main():
    results_df = evaluate(
        dataset=test_cases,
        metrics=[Faithfulness],
        llm=llm,
        embeddings=embed_model,
        raise_exceptions=False
    )
    print(results_df)
    results_df.to_csv(r'C:\TestingBot\Scripts\V1\myfile.csv', sep=',')
asyncio.run(main())
An empty DataFrame is returned when evaluating the dataset using the evaluate function because the dataset specified by dataset_name does not exist in LangSmith. The function raises a ValueError if the dataset is not found, which indicates that the dataset must be present in LangSmith for the evaluation to proceed [1].
Additionally, the evaluate function in the Ragas framework requires the dataset to be in a specific format with columns such as question, contexts, answer, and ground_truth. If the dataset does not conform to this format or if the required columns are missing, the function will raise a ValueError, leading to an empty or non-existent result [2].
Ensure that your dataset is correctly formatted and exists in LangSmith. Also, make sure you have set the necessary environment variables, such as OPENAI_API_KEY, if you are using metrics that require access to external APIs like OpenAI.
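Separately from the dataset checks, note that the custom evaluate function in the question creates results = [] and converts it to a DataFrame without ever appending to it, so the output is empty whatever the inputs are. Below is a minimal sketch of a loop that fills results, assuming each metric is a plain callable that takes one test case and returns a float. The exact_match metric and the dict-based test cases are hypothetical stand-ins, not Ragas APIs:

```python
import math
from typing import Callable, Dict, List


def exact_match(case: Dict[str, str]) -> float:
    """Hypothetical metric: 1.0 when the answer equals the ground truth."""
    return 1.0 if case["answer"] == case["ground_truth"] else 0.0


def evaluate_cases(
    cases: List[Dict[str, str]],
    metrics: List[Callable[[Dict[str, str]], float]],
    raise_exceptions: bool = True,
) -> List[Dict[str, object]]:
    results = []
    for case in cases:
        row: Dict[str, object] = {"question": case["question"]}
        for metric in metrics:
            try:
                row[metric.__name__] = metric(case)
            except Exception:
                if raise_exceptions:
                    raise
                row[metric.__name__] = math.nan  # mark the failed score
        results.append(row)  # without this line the output stays empty
    return results


cases = [
    {"question": "Q1", "answer": "A", "ground_truth": "A"},
    {"question": "Q2", "answer": "B", "ground_truth": "C"},
]
rows = evaluate_cases(cases, [exact_match])
print(rows)
```

Passing rows to pd.DataFrame(rows) would then give one row per test case instead of an empty frame.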
Does Ragas support the llama3.1:8b model? I am getting NaN output.
Common reasons for getting NaN outputs in the Ragas evaluation framework include:
When raise_exceptions is set to False, the evaluation will return np.nan for the row that failed.
Regarding support for the llama3.1:8b model, I couldn't find specific information in the repository. However, you can refer to the example code for integrating LlamaIndex with Ragas, which might help you set up and evaluate your model correctly [2].
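To see which rows failed, one option is to scan the per-row scores for NaN values. The sketch below assumes the scores are available as a list of dicts. The failed_rows helper and the sample scores are hypothetical and only illustrate the check:

```python
import math
from typing import Dict, List


def failed_rows(scores: List[Dict[str, float]]) -> Dict[int, List[str]]:
    """Map each row index to the metric names whose score is NaN."""
    failures: Dict[int, List[str]] = {}
    for index, row in enumerate(scores):
        bad = [
            name
            for name, value in row.items()
            if isinstance(value, float) and math.isnan(value)
        ]
        if bad:
            failures[index] = bad
    return failures


# Hypothetical scores, not real evaluation output.
scores = [
    {"faithfulness": 0.9, "answer_relevancy": math.nan},
    {"faithfulness": math.nan, "answer_relevancy": 0.7},
    {"faithfulness": 1.0, "answer_relevancy": 0.8},
]
print(failed_rows(scores))
```

Once the failing rows are known, rerunning the evaluation with raise_exceptions=True should surface the underlying exception instead of a silent NaN.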
Python version: 3.11, Ragas: 0.1.16
Please share the correct evaluation metrics.
The correct evaluation metrics to use with Ragas version 0.1.16 in Python 3.11 are:
context_precision
context_recall
faithfulness
answer_relevancy
context_utilization
context_entity_recall
noise_sensitivity
summarization_score
answer_semantic_similarity
answer_correctness
You can import and use these metrics as follows:
from ragas.metrics import (
answer_relevancy,
faithfulness,
context_recall,
context_precision,
context_utilization,
context_entity_recall,
noise_sensitivity,
summarization_score,
answer_semantic_similarity,
answer_correctness,
)
These metrics measure various aspects of your Retrieval Augmented Generation (RAG) system.
To run the evaluation on your dataset with these metrics, you can use the evaluate function provided by Ragas:
from ragas import evaluate
from datasets import Dataset
# Example dataset
dataset = Dataset.from_dict({
'question': ["What is the capital of France?"],
'contexts': [["Paris is the capital of France."]],
'answer': ["Paris"],
'ground_truth': ["Paris"]
})
# Run evaluation
result = evaluate(dataset)
print(result)
This will give you a comprehensive view of your RAG system's performance based on the specified metrics [1][2][3].
error while evaluating verify that the data type in your code aligns with the expected pyarrow data types.
To ensure that your data types align with the expected pyarrow data types in the Ragas evaluation framework, verify that your dataset columns have the following types:
question: list<string>
contexts: list<list<string>>
answer: list<string>
ground_truth: list<string>
Here's an example of how to prepare your dataset with the correct data types:
from datasets import Dataset
# Example dataset
data = {
'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
'contexts': [['The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,'],
['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference']],
'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
# Convert to Hugging Face Dataset
dataset = Dataset.from_dict(data)
# Verify the data types
print(dataset)
# Example usage with Ragas evaluation
from ragas import evaluate
from ragas.metrics import (
answer_relevancy,
faithfulness,
context_recall,
context_precision,
context_utilization,
context_entity_recall,
noise_sensitivity,
summarization_score,
answer_semantic_similarity,
answer_correctness,
)
# List of metrics
metrics = [
answer_relevancy,
faithfulness,
context_recall,
context_precision,
context_utilization,
context_entity_recall,
noise_sensitivity,
summarization_score,
answer_semantic_similarity,
answer_correctness,
]
# Run evaluation
result = evaluate(dataset, metrics=metrics)
print(result)
Ensure that your dataset conforms to these data types to avoid errors during evaluation [1][2][3].
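Before building the Dataset, the column-oriented dict can be checked in plain Python. The sketch below is a rough pre-flight check, not Ragas's own validation. check_columns is a hypothetical helper, and it only covers question, answer, and contexts:

```python
from typing import Any, Dict, List


def check_columns(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a column-oriented dict."""
    problems = []
    for name in ("question", "answer"):
        column = data.get(name)
        if column is None:
            continue
        if not all(isinstance(item, str) for item in column):
            problems.append(f"{name}: every entry should be a string")
    contexts = data.get("contexts")
    if contexts is not None:
        for row, entry in enumerate(contexts):
            is_list_of_str = isinstance(entry, list) and all(
                isinstance(chunk, str) for chunk in entry
            )
            if not is_list_of_str:
                problems.append(f"contexts[{row}]: expected a list of strings")
    # Every column needs one entry per sample.
    lengths = {name: len(column) for name, column in data.items()}
    if len(set(lengths.values())) > 1:
        problems.append(f"columns differ in length: {lengths}")
    return problems


good = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris"],
    "contexts": [["Paris is the capital of France."]],
}
bad = {"question": ["q"], "answer": ["a"], "contexts": ["not a list"]}
print(check_columns(good))
print(check_columns(bad))
```

An empty list means the basic shape looks right. Any message points at the column to fix before calling Dataset.from_dict.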
'dict' object has no attribute 'rename_columns'
You are getting the error "'dict' object has no attribute 'rename_columns'" because the rename_columns method is being called on a dictionary object instead of a Dataset object. The rename_columns method is a method of the Dataset class from the datasets library, not a dictionary.
In the provided code, the rename_columns method is used in the convert_v1_to_v2_dataset and convert_v2_to_v1_dataset functions, which expect a Dataset object as their input. If you pass a dictionary to these functions, you will encounter this error.
Here is the relevant code snippet:
def convert_v1_to_v2_dataset(dataset: Dataset) -> Dataset:
    columns_map = {v: k for k, v in REQUIRED_COLS_v1.items() if v in dataset.features}
    return dataset.rename_columns(columns_map)

def convert_v2_to_v1_dataset(dataset: Dataset) -> Dataset:
    columns_map = {k: v for k, v in REQUIRED_COLS_v1.items() if k in dataset.features}
    return dataset.rename_columns(columns_map)
Additionally, you can refer to the remap_column_names function from the ragas library, which also demonstrates how to remap column names in a Dataset object:
def remap_column_names(dataset: Dataset, column_map: dict[str, str]) -> Dataset:
    """
    Remap the column names in case dataset uses different column names
    """
    inverse_column_map = {v: k for k, v in column_map.items()}
    return dataset.rename_columns(inverse_column_map)
Make sure that you are passing a Dataset object to these functions, not a dictionary [1][2].
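The error can be reproduced with a plain dict, and one way to guard against it is to wrap dicts before they reach the evaluation code. In the sketch below, as_dataset is a hypothetical helper and user_input is only an illustrative target column name. Dataset.from_dict is the same call used elsewhere in this thread:

```python
data = {"question": ["What is the capital of France?"], "answer": ["Paris"]}

# A plain dict has no rename_columns method, which is what triggers the error.
print(hasattr(data, "rename_columns"))

try:
    data.rename_columns({"question": "user_input"})
except AttributeError as exc:
    message = str(exc)
    print(message)


def as_dataset(obj):
    """Wrap a plain dict in a Dataset; return anything else unchanged."""
    if isinstance(obj, dict):
        # Imported lazily so the check above runs without the datasets package.
        from datasets import Dataset

        return Dataset.from_dict(obj)
    return obj
```

Calling as_dataset(data) before handing the data to evaluate would give the library an object that actually has rename_columns.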
AttributeError: 'LLMTestCase' object has no attribute 'items'
data = LLMTestCase(
    question=query,
    answer=actual_output,
    ground_truth="Vector databases store and provide access to unstructured data",
    contexts=retrieval_context,
)
The AttributeError: 'LLMTestCase' object has no attribute 'items' suggests that somewhere in your code, an LLMTestCase object is being treated as a dictionary. This typically happens when a method or function expects a dictionary but receives an LLMTestCase object instead.
To resolve this, ensure that you are correctly using the LLMTestCase class and not mistakenly treating it as a dictionary. Here is an example of how to correctly instantiate and use an LLMTestCase object:
# Assuming LLMTestCase is defined correctly
data = LLMTestCase(
question=query,
answer=actual_output,
ground_truth="Vector databases store and provide access to unstructured data",
contexts=retrieval_context,
)
# Example usage
print(data.question) # Accessing attributes of LLMTestCase
print(data.answer)
print(data.ground_truth)
print(data.contexts)
If you need to convert an LLMTestCase object to a dictionary for some reason, you can manually create the dictionary:
data_dict = {
'question': data.question,
'answer': data.answer,
'ground_truth': data.ground_truth,
'contexts': data.contexts,
}
# Now you can use data_dict as a dictionary
Ensure that any function or method expecting a dictionary receives data_dict instead of the LLMTestCase object directly. This should resolve the AttributeError.
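When there are several test cases, the same conversion can be done in one pass. It produces a dict of columns, the shape that a mapping-based call such as Dataset.from_dict expects. The sketch below uses CaseRecord, a hypothetical dataclass standing in for LLMTestCase, whose real definition is not shown in this thread:

```python
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class CaseRecord:
    """Hypothetical stand-in for LLMTestCase with the same four fields."""

    question: str
    answer: str
    ground_truth: str
    contexts: List[str]


def cases_to_columns(cases: List[CaseRecord]) -> Dict[str, list]:
    """Turn a list of records into one list per column."""
    columns: Dict[str, list] = {
        "question": [],
        "answer": [],
        "ground_truth": [],
        "contexts": [],
    }
    for case in cases:
        for name, value in asdict(case).items():
            columns[name].append(value)
    return columns


cases = [
    CaseRecord("Q1", "A1", "G1", ["c1", "c2"]),
    CaseRecord("Q2", "A2", "G2", ["c3"]),
]
columns = cases_to_columns(cases)
print(columns)
```

The resulting dict has an items method, so code that iterates over columns no longer receives a bare test-case object.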
ValueError: Dataset feature "contexts" should be of type Sequence[string], got <class 'datasets.features.features.Sequence'>
data = {
    'question': [query],  # List with a single question
    'answer': [actual_output],  # List with a single answer
    'contexts': [retrieval_context],  # List with a single list of contexts (retrieval_context should be a list of strings)
    'ground_truth': [
        "Vector databases store and provide access to unstructured data, such as text or images, in the form of their vector embeddings. "
        "Because similar objects are close together in vector space, the similarity of data objects can be calculated based on the distance between the data object’s vector embeddings."
    ]  # List with a single ground truth
}
To resolve the ValueError: "Dataset feature 'contexts' should be of type Sequence[string], got <class 'datasets.features.features.Sequence'>", you need to ensure that the contexts field is correctly specified as a sequence of strings. Here is how you can adjust your dataset creation to meet the expected format:
Ensure that retrieval_context is a list of strings.
Here is the corrected code snippet:
from datasets import Dataset, Features, Sequence, Value
# Ensure retrieval_context is a list of strings
retrieval_context = [
"Context 1",
"Context 2",
# Add more contexts as needed
]
# Define your data
data = {
'question': [query], # List with a single question
'answer': [actual_output], # List with a single answer
'contexts': [retrieval_context], # List with a single list of contexts
'ground_truth': [
"Vector databases store and provide access to unstructured data, such as text or images, in the form of their vector embeddings. "
"Because similar objects are close together in vector space, the similarity of data objects can be calculated based on the distance between the data object’s vector embeddings."
] # List with a single ground truth
}
# Define the features explicitly
# Features describe the type of a single row, not of the whole column
features = Features({
    'question': Value('string'),
    'answer': Value('string'),
    'contexts': Sequence(Value('string')),
    'ground_truth': Value('string')
})
# Create the dataset
dataset = Dataset.from_dict(data, features=features)
# Verify the dataset
print(dataset)
This code ensures that the contexts field is recognized as a sequence of strings, which should resolve the ValueError [1][2][3].
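If retrieval_context comes from a retriever, it may arrive as a single string, a nested list, or a list of non-string objects, and any of those can produce a contexts column that is not a flat list of strings. The sketch below flattens such values. normalize_contexts is a hypothetical helper, and falling back to str() for non-string items is an assumption that may not suit every retriever:

```python
from typing import Any, List


def normalize_contexts(value: Any) -> List[str]:
    """Flatten a retrieval result into a flat list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    flat: List[str] = []
    for item in value:
        if isinstance(item, (list, tuple)):
            flat.extend(normalize_contexts(item))  # unwrap nested lists
        else:
            flat.append(str(item))  # coerce non-string chunks to text
    return flat


print(normalize_contexts("a single passage"))
print(normalize_contexts([["Context 1", "Context 2"], "Context 3"]))
```

The normalized list can then be used as the single entry of the contexts column, for example 'contexts': [normalize_contexts(retrieval_context)].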
KeyError('summary') Exception raised in Job[2]: TypeError(CustomOllama.agenerate_text() got an unexpected keyword argument 'n')
You are encountering a KeyError('summary') and a TypeError stating that CustomOllama.agenerate_text() got an unexpected keyword argument 'n'.
KeyError('summary'):
TypeError(CustomOllama.agenerate_text() got an unexpected keyword argument 'n'):
The CustomOllama class's agenerate_text method does not support the n argument. Update the method signature to include the n argument:
async def agenerate_text(
    self,
    prompt: PromptValue,
    n: int = 1,
    temperature: t.Optional[float] = None,
    stop: t.Optional[t.List[str]] = None,
    callbacks: t.List = [],
) -> LLMResult:
    # Your implementation here
    ...
This should resolve the issues you are facing [3][4].
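Accepting n in the signature is only half of the change: the method also has to do something with it. The sketch below shows one possible approach, assuming the backend returns a single completion per request, so the method issues n requests concurrently. EchoLLM and _complete are hypothetical stand-ins, and the method returns a plain list of strings rather than an LLMResult to keep the example self-contained:

```python
import asyncio
from typing import List, Optional


class EchoLLM:
    """Hypothetical stand-in for a custom LLM wrapper."""

    async def _complete(self, prompt: str, temperature: float) -> str:
        # Placeholder for the real model call (for example an HTTP request).
        await asyncio.sleep(0)
        return f"echo: {prompt}"

    async def agenerate_text(
        self,
        prompt: str,
        n: int = 1,
        temperature: Optional[float] = None,
        stop: Optional[List[str]] = None,  # accepted but unused in this sketch
        callbacks: Optional[list] = None,  # accepted but unused in this sketch
    ) -> List[str]:
        temp = 1e-8 if temperature is None else temperature
        # One completion per call, so issue n calls and gather the results.
        tasks = [self._complete(prompt, temp) for _ in range(n)]
        return list(await asyncio.gather(*tasks))


outputs = asyncio.run(EchoLLM().agenerate_text("hello", n=3))
print(outputs)
```

In a real wrapper the n completions would be packed into whatever result type the framework expects.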
AttributeError('CustomOllama' object has no attribute 'model_name')
The AttributeError: 'CustomOllama' object has no attribute 'model_name' indicates that the CustomOllama class is missing the model_name attribute. To resolve this, you need to ensure that the CustomOllama class includes the model_name attribute.
Here is an example of how you might define the CustomOllama class with the model_name attribute:
import typing as t

# Assumption: LLMResult is the LangChain result type
from langchain_core.outputs import LLMResult

class CustomOllama:
    def __init__(self, model_name: str, other_params: dict):
        self.model_name = model_name
        # Initialize other attributes as needed
        for key, value in other_params.items():
            setattr(self, key, value)

    async def agenerate_text(
        self,
        prompt: str,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = [],
    ) -> LLMResult:
        # Your implementation here
        pass

    # Add other methods as needed
Ensure that when you instantiate the CustomOllama class, you provide the model_name:
ollama = CustomOllama(model_name="llama3.1:8b", other_params={})
This should resolve the AttributeError by ensuring that the model_name attribute is properly defined and accessible in the CustomOllama class.
Faithfulness is not callable