explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0

Exception raised in Job[2]: ValueError(Got unknown type content= #1340

Open tombabycome opened 2 months ago

tombabycome commented 2 months ago

My code is:

from datasets import Dataset
from ragas.llms import LangchainLLMWrapper
from langchain_community.embeddings import SparkLLMTextEmbeddings, HuggingFaceEmbeddings
from langchain_community.chat_models import ChatSparkLLM
from sparkai.llm.llm import ChatSparkLLM  # note: shadows the ChatSparkLLM imported above

SPARKAI_URL = 'wss://spark-api.xf-yun.com/v3.5/chat'
SPARKAI_APP_ID = ''
SPARKAI_API_SECRET = ''
SPARKAI_API_KEY = ''
SPARKAI_DOMAIN = 'generalv3.5'

spark = ChatSparkLLM(
    spark_app_id=SPARKAI_APP_ID,
    spark_api_key=SPARKAI_API_KEY,
    spark_api_secret=SPARKAI_API_SECRET,
    spark_api_url=SPARKAI_URL,
    spark_llm_domain="generalv3.5",
)
v_llm = LangchainLLMWrapper(spark)

model_name = "all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
)

from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
)

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'contexts': [
        ['The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,'],
        ['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference'],
    ],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times'],
}
dataset = Dataset.from_dict(data_samples)

# Run the one-click evaluation again
result = evaluate(
    dataset,
    llm=v_llm,
    embeddings=embeddings,
    metrics=[
        context_precision,
        faithfulness,
        answer_relevancy,
        context_recall,
    ],
)
df = result.to_pandas()
df.head()
print(df)
df.to_csv('output.csv', index=False)  # Save the DataFrame to a CSV file, excluding the index column

But the result is wrong, and all the evaluation scores are NaN. The error is as follows:

Evaluating: 0%| | 0/8 [00:00<?, ?it/s]Exception raised in Job[2]: ValueError(Got unknown type content='Generate a question for the given answer and Identify if answer is noncommittal. Give noncommittal as 1 if the answer is noncommittal and 0 if the answer is committal. A noncommittal answer is one that is evasive, vague, or ambiguous. For example, "I don\'t know" or "I\'m not sure" are noncommittal answers\n\nThe output should be a well-formatted JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output JSON schema:\n\n{"type": "object", "properties": {"question": {"title": "Question", "type": "string"}, "noncommittal": {"title": "Noncommittal", "type": "integer"}}, "required": ["question", "noncommittal"]}\n\n\nDo not return any preamble or explanations, return only a pure JSON string surrounded by triple backticks ().\n\nExamples:\n\nanswer: "Albert Einstein was born in Germany."\ncontext: "Albert Einstein was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time"\noutput:{"question": "Where was Albert Einstein born?", "noncommittal": 0}\n\nanswer: "It can change its skin color based on the temperature of its environment."\ncontext: "A recent scientific study has discovered a new species of frog in the Amazon rainforest that has the unique ability to change its skin color based on the temperature of its environment."\noutput:{"question": "What unique ability does the newly discovered species of frog have?", "noncommittal": 0}\n\nanswer: "Everest"\ncontext: "The tallest mountain on Earth, measured from sea level, is a renowned peak located in the Himalayas."\noutput:{"question": "What is the tallest mountain on Earth?", "noncommittal": 0}\n\nanswer: "I don\'t know about the groundbreaking feature of the smartphone invented in 2023 as am unaware of information beyond 2022. "\ncontext: "In 2023, a groundbreaking invention was announced: a smartphone with a battery life of one month, revolutionizing the way people use mobile technology."\noutput:{"question": "What was the groundbreaking feature of the smartphone invented in 2023?", "noncommittal": 1}```\n\nYour actual task:\n\nanswer: "The first superbowl was held on Jan 15, 1967"\ncontext: "The First AFL\u2013NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,"\noutput: \n' additional_kwargs={} response_metadata={})

dosubot[bot] commented 2 months ago

Hey @tombabycome! I'm here to help you out with any issues or questions you have. Let's work together to solve this problem.

The evaluation result in your code is all NaN because the raise_exceptions parameter is set to False. When this parameter is False, the evaluation will return np.nan for any row that fails instead of raising an exception. This behavior is controlled by the wrap_callable_with_index method in the Executor class, which logs the error and returns np.nan if an exception occurs and raise_exceptions is False [1].
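
The pattern Dosu is describing looks roughly like this (a minimal sketch of the wrap-and-return-NaN behaviour, not the actual ragas Executor code; the function name and logging here are illustrative):

import logging
import numpy as np

logger = logging.getLogger(__name__)

def wrap_with_nan_fallback(callable_, raise_exceptions: bool):
    # Illustrative only: mimics an executor wrapper that either re-raises the
    # error or logs it and records np.nan for the failed row.
    def wrapped(*args, **kwargs):
        try:
            return callable_(*args, **kwargs)
        except Exception as e:
            if raise_exceptions:
                raise
            logger.error("Exception raised in job: %r", e)
            return np.nan
    return wrapped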

The ValueError in Job[2] is caused by validation failures due to missing or incorrectly formatted columns required for the metrics. This is indicated by the following part of the code:

Raises
------
ValueError
    if validation fails because the columns required for the metrics are missing or
    if the columns are of the wrong format.

To resolve these issues, ensure that the dataset contains all the required columns in the correct format and consider setting raise_exceptions to True to get more detailed error messages. This will help you identify the specific cause of the ValueError in Job[2] and address it accordingly [2].

The required columns and their formats for the metrics used in the evaluate function are determined by the required_columns attribute of each metric. The validate_required_columns function checks if the dataset contains all the necessary columns for each metric based on the sample type (either SingleTurnSample or MultiTurnSample). The specific required columns and their formats are defined within each metric's required_columns attribute, which is a dictionary mapping the metric type to a list of required columns [3][2][4].
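
Conceptually, that validation step works along these lines (a simplified sketch only; the real validate_required_columns in ragas operates on sample types and a per-metric-type mapping, and the names below are illustrative):

def validate_required_columns(dataset_columns, metrics):
    # Simplified sketch: each metric declares the columns it needs, and
    # validation fails with a ValueError if any are missing from the dataset.
    for metric in metrics:
        required = set(getattr(metric, "required_columns", []))
        missing = required - set(dataset_columns)
        if missing:
            raise ValueError(
                f"The metric [{metric.name}] requires columns {sorted(missing)} "
                "that are not present in the dataset."
            )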

The evaluate function expects the dataset to use the default column names declared in each metric's required_columns; for the metrics above these cover the user question, the generated answer, the retrieved contexts, and the ground-truth reference.

If the column names in your dataset differ from these defaults, you can provide a mapping using the column_map parameter to align them correctly. For example, if your dataset uses contexts_v1 instead of retrieved_contexts, you can pass {"retrieved_contexts": "contexts_v1"} in the column_map [3].
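
Such a call might look like this (a sketch reusing the variables from the code at the top of the thread; the contexts_v1 column name is just a hypothetical example):

from ragas import evaluate

# Hypothetical: the dataset stores its contexts under "contexts_v1" rather
# than the default "retrieved_contexts" column, so we map it explicitly.
result = evaluate(
    dataset,
    llm=v_llm,
    embeddings=embeddings,
    metrics=[context_precision, faithfulness, answer_relevancy, context_recall],
    column_map={"retrieved_contexts": "contexts_v1"},
)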

To continue talking to Dosu, mention @dosu.

tombabycome commented 2 months ago

@dosu How do I set raise_exceptions to True to get more detailed error messages?

tombabycome commented 2 months ago

@dosu ValueError: Got unknown type content='Given question, answer and context verify if the context was useful in arriving at the given answer. Give verdict as "1" if useful and "0" if not with json output.\n\nThe output should be a well-formatted JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output JSON schema:\n\n{"description": "Answer for the verification task wether the context was useful.", "type": "object", "properties": {"reason": {"title": "Reason", "description": "Reason for verification", "type": "string"}, "verdict": {"title": "Verdict", "description": "Binary (0/1) verdict of verification", "type": "integer"}}, "required": ["reason", "verdict"]}\n\n\nDo not return any preamble or explanations, return only a pure JSON string surrounded by triple backticks ().\n\nExamples:\n\nquestion: "What can you tell me about albert Albert Einstein?"\ncontext: "Albert Einstein (14 March 1879 – 18 April 1955) was a German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time. Best known for developing the theory of relativity, he also made important contributions to quantum mechanics, and was thus a central figure in the revolutionary reshaping of the scientific understanding of nature that modern physics accomplished in the first decades of the twentieth century. His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been called \\"the world\'s most famous equation\\". He received the 1921 Nobel Prize in Physics \\"for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect\\", a pivotal step in the development of quantum theory. His work is also known for its influence on the philosophy of science. In a 1999 poll of 130 leading physicists worldwide by the British journal Physics World, Einstein was ranked the greatest physicist of all time. His intellectual achievements and originality have made Einstein synonymous with genius."\nanswer: "Albert Einstein born in 14 March 1879 was German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time. He received the 1921 Nobel Prize in Physics for his services to theoretical physics. He published 4 papers in 1905. Einstein moved to Switzerland in 1895"\nverification:{"reason": "The provided context was indeed useful in arriving at the given answer. The context includes key information about Albert Einstein\'s life and contributions, which are reflected in the answer.", "verdict": 1}\n\nquestion: "who won 2020 icc world cup?"\ncontext: "The 2022 ICC Men\'s T20 World Cup, held from October 16 to November 13, 2022, in Australia, was the eighth edition of the tournament. Originally scheduled for 2020, it was postponed due to the COVID-19 pandemic. 
England emerged victorious, defeating Pakistan by five wickets in the final to clinch their second ICC Men\'s T20 World Cup title."\nanswer: "England"\nverification:{"reason": "the context was useful in clarifying the situation regarding the 2020 ICC World Cup and indicating that England was the winner of the tournament that was intended to be held in 2020 but actually took place in 2022.", "verdict": 1}\n\nquestion: "What is the tallest mountain in the world?"\ncontext: "The Andes is the longest continental mountain range in the world, located in South America. It stretches across seven countries and features many of the highest peaks in the Western Hemisphere. The range is known for its diverse ecosystems, including the high-altitude Andean Plateau and the Amazon rainforest."\nanswer: "Mount Everest."\nverification:{"reason": "the provided context discusses the Andes mountain range, which, while impressive, does not include Mount Everest or directly relate to the question about the world\'s tallest mountain.", "verdict": 0}```\n\nYour actual task:\n\nquestion: "When was the first super bowl?"\ncontext: "The First AFL\u2013NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,"\nanswer: "The first superbowl was held on January 15, 1967"\nverification: \n' additional_kwargs={} response_metadata={}

dosubot[bot] commented 2 months ago

To set raise_exceptions to True in your evaluation code, you need to pass raise_exceptions=True as an argument to the evaluate function. Here is an example:

from ragas import evaluate

# Assuming `dataset` is already defined
result = evaluate(dataset, raise_exceptions=True)
print(result)

This will ensure that exceptions are raised during the evaluation, providing more detailed error messages [1].
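
Applied to the call from the original snippet, that would look something like this (reusing the llm, embeddings, and metrics already defined there):

result = evaluate(
    dataset,
    llm=v_llm,
    embeddings=embeddings,
    metrics=[context_precision, faithfulness, answer_relevancy, context_recall],
    raise_exceptions=True,  # re-raise failures instead of silently recording NaN
)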

To continue talking to Dosu, mention @dosu.

tombabycome commented 2 months ago

@dosu ValueError: Got unknown type content='Given question, answer and context verify if the context was useful in arriving at the given answer. Give verdict as "1" if useful and "0" if not with json output.\n\nThe output should be a well-formatted JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output JSON schema:\n\n{"description": "Answer for the verification task wether the context was useful.", "type": "object", "properties": {"reason": {"title": "Reason", "description": "Reason for verification", "type": "string"}, "verdict": {"title": "Verdict", "description": "Binary (0/1) verdict of verification", "type": "integer"}}, "required": ["reason", "verdict"]}\n\n\nDo not return any preamble or explanations, return only a pure JSON string surrounded by triple backticks ().\n\nExamples:\n\nquestion: "What can you tell me about albert Albert Einstein?"\ncontext: "Albert Einstein (14 March 1879 – 18 April 1955) was a German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time. Best known for developing the theory of relativity, he also made important contributions to quantum mechanics, and was thus a central figure in the revolutionary reshaping of the scientific understanding of nature that modern physics accomplished in the first decades of the twentieth century. His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been called \\"the world\'s most famous equation\\". He received the 1921 Nobel Prize in Physics \\"for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect\\", a pivotal step in the development of quantum theory. His work is also known for its influence on the philosophy of science. In a 1999 poll of 130 leading physicists worldwide by the British journal Physics World, Einstein was ranked the greatest physicist of all time. His intellectual achievements and originality have made Einstein synonymous with genius."\nanswer: "Albert Einstein born in 14 March 1879 was German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time. He received the 1921 Nobel Prize in Physics for his services to theoretical physics. He published 4 papers in 1905. Einstein moved to Switzerland in 1895"\nverification:{"reason": "The provided context was indeed useful in arriving at the given answer. The context includes key information about Albert Einstein\'s life and contributions, which are reflected in the answer.", "verdict": 1}\n\nquestion: "who won 2020 icc world cup?"\ncontext: "The 2022 ICC Men\'s T20 World Cup, held from October 16 to November 13, 2022, in Australia, was the eighth edition of the tournament. Originally scheduled for 2020, it was postponed due to the COVID-19 pandemic. 
England emerged victorious, defeating Pakistan by five wickets in the final to clinch their second ICC Men\'s T20 World Cup title."\nanswer: "England"\nverification:{"reason": "the context was useful in clarifying the situation regarding the 2020 ICC World Cup and indicating that England was the winner of the tournament that was intended to be held in 2020 but actually took place in 2022.", "verdict": 1}\n\nquestion: "What is the tallest mountain in the world?"\ncontext: "The Andes is the longest continental mountain range in the world, located in South America. It stretches across seven countries and features many of the highest peaks in the Western Hemisphere. The range is known for its diverse ecosystems, including the high-altitude Andean Plateau and the Amazon rainforest."\nanswer: "Mount Everest."\nverification:{"reason": "the provided context discusses the Andes mountain range, which, while impressive, does not include Mount Everest or directly relate to the question about the world\'s tallest mountain.", "verdict": 0}```\n\nYour actual task:\n\nquestion: "When was the first super bowl?"\ncontext: "The First AFL\u2013NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,"\nanswer: "The first superbowl was held on January 15, 1967"\nverification: \n' additional_kwargs={} response_metadata={}

dosubot[bot] commented 2 months ago

To resolve the ValueError related to the JSON schema when verifying if the context was useful in arriving at the given answer, ensure that the JSON output adheres to the expected schema defined by ContextPrecisionVerification. Here is an example of how to structure the JSON output correctly:

from pydantic import BaseModel, Field

class ContextPrecisionVerification(BaseModel):
    """Answer for the verification task whether the context was useful."""
    reason: str = Field(..., description="Reason for verification")
    verdict: int = Field(..., description="Binary (0/1) verdict of verification")

# Example usage
example_verification = ContextPrecisionVerification(
    reason=(
        "The context describes the First AFL–NFL World Championship Game played on "
        "January 15, 1967, which directly supports the answer about the first Super Bowl."
    ),
    verdict=1,
)

print(example_verification.json())

Ensure that the reason field is a string and the verdict field is an integer (1 or 0), as specified in the schema. This should resolve the ValueError related to the JSON schema [1].
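
In raw form, the model's reply for the task in the error message would then need to be just the JSON wrapped in triple backticks, as the prompt instructs, for example:

```{"reason": "The context states the game was played on January 15, 1967, which matches the answer.", "verdict": 1}```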

To continue talking to Dosu, mention @dosu.