beatG123 commented 2 months ago

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug

ValueError: The metric [context_precision] that that is used requires the following additional columns ['reference'] to be present in the dataset.

Ragas version: 0.1.18
Python version: 3.10.9

Code to Reproduce

from agent import *
from datasets import Dataset
import json
import random

agent = Agent()

# Load the test question-answer pairs
input_file_name = os.path.join(PROJECT_PATH, 'data/test_data/qa_pair.json')
with open(input_file_name, 'r', encoding='utf-8') as f:
    data_list = json.load(f)

# Draw a random sample
random_data = random.sample(data_list, 1)

questions = []
ground_truths = []

for pair in random_data:
    questions.append(pair['query'])
    ground_truths.append([pair['ground_truths']])

answers = []
contexts = []

# Generate answers and contexts
for i, query in enumerate(questions):
    # answers.append(agent.query(query))
    a = agent.retrival_func('', query)
    answers.append(a)
    print(f"[Question]: {query}")
    print(f"[Answer]: {a}")
    print(f"[Ground truth]: {ground_truths[i]}")
    contexts.append(agent.retriver_test(query, 3))

# Build the test data structure
data = {
    "question": questions,
    "answer": answers,
    "contexts": contexts,
    "ground_truths": ground_truths
}
print(data)
dataset = Dataset.from_dict(data)
print(dataset)
# print(type(data['question']))
# print(type(data['answer']))
# print(type(data['contexts']))
# print(type(data['ground_truths']))

# Evaluate with RAGAs
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
)

from ragas.llms import LangchainLLMWrapper
llm_instance = LangchainLLMWrapper(ChatOpenAI())
faithfulness.llm = llm_instance

faithfulness.nli_statements_message.language = "chinese"
faithfulness.statement_prompt.language = "chinese"
faithfulness.adapt(language="chinese")
# faithfulness.save()

result = evaluate(
    dataset,
    metrics=[
        context_precision,
        context_recall,
        faithfulness,
        answer_relevancy,
    ],
)

print(result)
# print(result.to_pandas())

#
# import pandas as pd
#
# Set options to display all rows and columns
# pd.set_option('display.max_rows', None)
# pd.set_option('display.max_columns', None)
# pd.set_option('display.width', None)  # or set a sufficiently large width
# pd.set_option('display.max_colwidth', None)
#
# df = result.to_pandas()
# from tabulate import tabulate
# print(tabulate(df, headers='keys', tablefmt='psql', showindex=False))

Error trace

Traceback (most recent call last):
  File "D:.study\myProject\ragas_eval_2.py", line 71, in <module>
    result = evaluate(
  File "F:\Application2022\Anaconda\an\lib\site-packages\ragas\_analytics.py", line 129, in wrapper
    result = func(*args, **kwargs)
  File "F:\Application2022\Anaconda\an\lib\site-packages\ragas\evaluation.py", line 177, in evaluate
    validate_required_columns(dataset, metrics)
  File "F:\Application2022\Anaconda\an\lib\site-packages\ragas\validation.py", line 62, in validate_required_columns
    raise ValueError(
ValueError: The metric [context_precision] that that is used requires the following additional columns ['reference'] to be present in the dataset.

Expected behavior

{'question': ['上市公司控股股东、实际控制人不得以什么方式侵占上市公司利益？'], 'answer': ['上市公司控股股东、实际控制人不得以资金占用、违规担保等方式侵占上市公司利益。这些行为会严重损害上市公司的利益，若构成违规事实，相关责任单位将依法监管，分类处置，并限期予以清偿或化解，甚至对构成犯罪的行为依法追究刑事责任。\n\n数据来源：青岛市人民政府关于进一步提高上市公司质量的实施意见。'], 'contexts': [['际控制人、董事、监事和高级管理人员明悉职责界限和法律责任。开展公司治理专项行\n动，强化底线要求，倡导最佳实践。结合国企改革三年行动，健全国有控股上市公司治理\n机制，强化对无实际控制人上市公司的监管。支持投资者保护机构持股行权。督导上市公\n司加强投资者关系管理。严格执行上市公司内控制度，披露年度自我评价报告。规范内幕\n信息知情人登记和报送行为，加强内幕交易综合防控。加强内部控制审计业务监管。（责\n任单位：青岛证监局，市国资委、市公安局，青岛银保监局）\n（二）提升信息披露质量。落实以信息披露为核心的注册制理念，持续开展法规政策培', '资金占用、违规担保等损害上市公司利益的行为。对已构成违规事实的, 坚持依法监管、\n分类处置，限期予以清偿或化解；严厉查处限期未整改或新发生的资金占用、违规担保问\n题，构成犯罪的依法追究刑事责任。（责任单位：市公安局、市国资委，青岛证监局、市\n中级法院，各区、市政府）\n（九）加强对退市风险公司的监管。严格执行退市制度，对被交易所采取退市风险警示的\n上市公司加强风险监测，严厉打击通过财务造假、利益输送、操纵市场等方式恶意规避退\n市行为。支持上市公司通过并购重组、破产重整等方式出清风险。（责任单位：青岛证监']], 'ground_truths': [['上市公司控股股东、实际控制人不得以任何方式侵占上市公司利益。']]}
Dataset({
    features: ['question', 'answer', 'contexts', 'ground_truths'],
    num_rows: 1
})

This is the format of my data. I want to use Chinese data for evaluation, but the error above occurred. What should I do?

Additional context

dosubot[bot] commented 2 months ago

To resolve the ValueError related to the 'context_precision' metric requiring the 'reference' column when using Chinese data for evaluation in Ragas, you need to ensure that the 'reference' column is included in your dataset. The ContextPrecision class requires the 'reference' column as part of its _required_columns attribute for the MetricType.SINGLE_TURN metric type.

Here is the relevant part of the code that specifies the required columns:

@dataclass
class ContextPrecision(MetricWithLLM, SingleTurnMetric):
    name: str = "context_precision"  # type: ignore
    _required_columns: t.Dict[MetricType, t.Set[str]] = field(
        default_factory=lambda: {
            MetricType.SINGLE_TURN: {
                "user_input",
                "retrieved_contexts",
                "reference",
            }
        }
    )
    ...

To fix the error, make sure your dataset includes the 'reference' column. Here is an example of how your dataset should look:

[
    {
        "user_input": "你的问题是什么？",
        "retrieved_contexts": ["这是一个示例上下文。"],
        "reference": "这是一个参考答案。"
    },
    ...
]

Ensure that each entry in your dataset includes the 'user_input', 'retrieved_contexts', and 'reference' fields. This should resolve the ValueError related to the missing 'reference' column ^[1].
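As a quick sanity check before calling evaluate, the required-column rule quoted above can be sketched in plain Python. This is only an illustration: missing_columns is a hypothetical helper and not part of Ragas, and the required set is copied from the _required_columns snippet in this comment rather than read from the installed library.

```python
# Required keys copied from the ContextPrecision snippet quoted in this thread.
REQUIRED_CONTEXT_PRECISION = {"user_input", "retrieved_contexts", "reference"}


def missing_columns(rows, required=REQUIRED_CONTEXT_PRECISION):
    """Return the sorted names of required keys absent from at least one row.

    Hypothetical helper for illustration; Ragas performs its own validation.
    """
    missing = set()
    for row in rows:
        missing |= set(required) - set(row)
    return sorted(missing)


rows = [
    # This row has no "reference" key, which mirrors the reported error.
    {"user_input": "你的问题是什么？", "retrieved_contexts": ["这是一个示例上下文。"]},
]
print(missing_columns(rows))
```

Running a check like this on the rows first makes it easier to see which key is absent before the evaluation starts.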

To continue talking to Dosu, mention @dosu.

beatG123 commented 2 months ago

I am unable to do that as it involves underlying work. Please provide a reasonable solution after deep consideration.

dosubot[bot] commented 2 months ago

A reasonable solution to resolve the ValueError without involving underlying work is to modify or extend the ContextPrecision metric to bypass the requirement for the 'reference' column. You can create a new class that inherits from ContextPrecision and override the _required_columns attribute and the _get_row_attributes method to exclude the 'reference' column.

Here is an example of how you can do this:

from ragas.metrics import ContextPrecision, MetricType
from dataclasses import dataclass, field
import typing as t

@dataclass
class CustomContextPrecision(ContextPrecision):
    name: str = "custom_context_precision"
    _required_columns: t.Dict[MetricType, t.Set[str]] = field(
        default_factory=lambda: {
            MetricType.SINGLE_TURN: {"user_input", "retrieved_contexts"}
        }
    )

    def _get_row_attributes(self, row: t.Dict) -> t.Tuple[str, t.List[str]]:
        return row["user_input"], row["retrieved_contexts"]

# Use the custom metric for evaluation
result = evaluate(
    dataset,
    metrics=[
        CustomContextPrecision(),
        context_recall,
        faithfulness,
        answer_relevancy,
    ],
)

print(result)

This custom class CustomContextPrecision modifies the _required_columns to include only 'user_input' and 'retrieved_contexts', and updates the _get_row_attributes method to return these attributes, effectively bypassing the requirement for the 'reference' column ^[1].


beatG123 commented 2 months ago

Modifying or extending the ContextPrecision metric to bypass the requirement for the "reference" column is only a temporary workaround and does not address the underlying issue. Please suggest a solution that uses the packaged, built-in functionality instead.

dosubot[bot] commented 2 months ago

To resolve the ValueError related to the 'context_precision' metric requiring the 'reference' column in a long-term and packaged way without modifying or extending the ContextPrecision class, you need to ensure that the input data includes the 'reference' column. This column is required by the _get_row_attributes method in the ContextPrecision class.

Here is an example of how you can prepare your input data to include the 'reference' column:

# Example input data
input_data = {
    "user_input": "What can you tell me about Albert Einstein?",
    "retrieved_contexts": [
        "Albert Einstein (14 March 1879 – 18 April 1955) was a German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time. Best known for developing the theory of relativity, he also made important contributions to quantum mechanics, and was thus a central figure in the revolutionary reshaping of the scientific understanding of nature that modern physics accomplished in the first decades of the twentieth century. His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been called 'the world's most famous equation'. He received the 1921 Nobel Prize in Physics 'for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect', a pivotal step in the development of quantum theory. His work is also known for its influence on the philosophy of science. In a 1999 poll of 130 leading physicists worldwide by the British journal Physics World, Einstein was ranked the greatest physicist of all time. His intellectual achievements and originality have made Einstein synonymous with genius."
    ],
    "reference": "Albert Einstein born in 14 March 1879 was German-born theoretical physicist, widely held to be one of the greatest and most influential scientists of all time. He received the 1921 Nobel Prize in Physics for his services to theoretical physics. He published 4 papers in 1905. Einstein moved to Switzerland in 1895."
}

# Ensure the input data includes the 'reference' column
context_precision = ContextPrecision()
score = await context_precision._ascore(input_data, callbacks=None)
print(score)

By including the 'reference' column in your input data, you can avoid the ValueError and ensure that the ContextPrecision class functions correctly ^[1].


shahules786 commented 2 months ago

Hi @beatG123, ContextPrecision requires reference (v0.2 naming) / ground_truth (v0.1 naming) to be present in the dataset. If you don't have it, please switch to ContextUtilization: https://docs.ragas.io/en/stable/concepts/metrics/context_utilization.html

If that's not the case, can you make sure that you are using the latest version of ragas? There was a bug in the 0.1.17 release.
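The naming change mentioned above can be sketched as a simple column rename on a plain dict before it is turned into a dataset. Treat this as a rough illustration under stated assumptions: the thread itself confirms only the ground_truth / reference pair, and user_input and retrieved_contexts appear in the bot's snippet; the question, contexts, and answer / response pairings are assumptions about the v0.2 naming, and rename_columns is a hypothetical helper, not a Ragas function.

```python
# Assumed mapping from v0.1-style column names to v0.2-style names.
# Only ground_truth -> reference is stated explicitly in this thread.
V01_TO_V02 = {
    "question": "user_input",
    "answer": "response",
    "contexts": "retrieved_contexts",
    "ground_truth": "reference",
}


def rename_columns(data, mapping=V01_TO_V02):
    """Return a copy of a column dict with keys renamed per the mapping.

    Keys that are not in the mapping are kept unchanged.
    """
    return {mapping.get(name, name): values for name, values in data.items()}


old_style = {
    "question": ["When was the first super bowl?"],
    "contexts": [["The First AFL-NFL World Championship Game ..."]],
    "ground_truth": ["The first superbowl was held on January 15, 1967"],
}
print(sorted(rename_columns(old_style)))
```

Whether a rename is needed at all depends on the installed version, so checking the version first is still the safer step.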

shahules786 commented 2 months ago


```python
from ragas.metrics import context_precision, faithfulness, context_recall, answer_relevancy, answer_similarity
from ragas import evaluate
from datasets import Dataset

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
    'contexts': [
        ['The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,'],
        ['The Green Bay Packers...Green Bay, Wisconsin.', 'The Packers compete...Football Conference'],
    ],
    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset, metrics=[context_precision])
score.to_pandas()
```
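One visible difference between this sample and the dataset in the original report is the ground-truth column: the report uses ground_truths with a list per row, while the sample uses ground_truth with a single string per row. A minimal sketch of converting one shape into the other follows, assuming each inner list can be joined into one string. The helper to_single_ground_truth is hypothetical and not part of Ragas, and whether this alone clears the error on a given version has not been verified here.

```python
def to_single_ground_truth(data, sep=" "):
    """Return a copy of the column dict with 'ground_truths' (list per row)
    replaced by 'ground_truth' (one string per row).

    Hypothetical helper for illustration only.
    """
    converted = {name: values for name, values in data.items() if name != "ground_truths"}
    converted["ground_truth"] = [
        sep.join(item) if isinstance(item, (list, tuple)) else item
        for item in data["ground_truths"]
    ]
    return converted


reported = {
    "question": ["q1"],
    "answer": ["a1"],
    "contexts": [["c1", "c2"]],
    "ground_truths": [["g1"]],
}
print(to_single_ground_truth(reported)["ground_truth"])
```

The converted dict can then be passed to Dataset.from_dict in the same way as data_samples above.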

explodinggradients / ragas

ValueError: The metric [context_precision] that that is used requires the following additional columns ['reference'] to be present in the dataset. #1299
