explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

filter_question instruction may contain an error #1038

Open jaqennnn opened 3 months ago

jaqennnn commented 3 months ago

"Asses the given question for clarity and answerability given enough domain knowledge, consider the following criteria:"

I think the idea you want to convey is to assess the question; note that "Asses" in the instruction above should be "Assess". It would be a good idea to review all instructions (or the output of prompt.to_string) with Grammarly.
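
A minimal sketch of that check (assuming filter_question_prompt can be imported from ragas.testset.prompts and that Prompt exposes the to_string() method mentioned above; both are assumptions, so adjust to your installed ragas version):

# Render the prompt text and scan it for the misspelling.
from ragas.testset.prompts import filter_question_prompt

rendered = filter_question_prompt.to_string()
print(rendered)  # paste this into Grammarly or another checker for a full review
assert "Asses " not in rendered, "typo: 'Asses' should be 'Assess'"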

dosubot[bot] commented 3 months ago

Hey @jaqennnn! I'm here to help you with any bugs, questions, or contributions. Let's get started!

Here's a revised version of the instruction to ensure it clearly conveys the idea of assessing the question for clarity and answerability based on the given criteria:

# Note: Prompt, get_json_format_instructions, and QuestionFilter are assumed to
# already be in scope, as they are in ragas' testset prompts module (check your
# installed ragas version for the exact locations).
filter_question_prompt = Prompt(
    name="filter_question",
    instruction="""
Assess the given question for clarity and answerability, considering the following criteria:
1. Independence: Can the question be understood and answered without needing additional context or access to external references not provided within the question itself? Questions should be self-contained, meaning they do not rely on specific documents, tables, or prior knowledge not shared within the question.
2. Clear Intent: Is it clear what type of answer or information the question seeks? The question should convey its purpose without ambiguity, allowing for a direct and relevant response.
Based on these criteria, assign a verdict of "1" if a question is specific, independent, and has a clear intent, making it understandable and answerable based on the details provided. Assign "0" if it fails to meet one or more of these criteria due to vagueness, reliance on external references, or ambiguity in intent.
Provide feedback and a verdict in JSON format, including suggestions for improvement if the question is deemed unclear. Highlight aspects of the question that contribute to its clarity or lack thereof, and offer advice on how it could be reframed or detailed for better understanding and answerability.
""",
    output_format_instruction=get_json_format_instructions(QuestionFilter),
    examples=[
        {
            "question": "What is the discovery about space?",
            "output": QuestionFilter.parse_obj(
                {
                    "feedback": "The question is too vague and broad, asking for a 'discovery about space' without specifying any particular aspect, time frame, or context of interest. This could refer to a wide range of topics, from the discovery of new celestial bodies to advancements in space travel technology. To improve clarity and answerability, the question could specify the type of discovery (e.g., astronomical, technological), the time frame (e.g., recent, historical), or the context (e.g., within a specific research study or space mission).",
                    "verdict": "0",
                }
            ).dict(),
        },
        {
            "question": "How does ALMA-13B-R perform compared to other translation models in the WMT'23 study, based on the results in context1 and context2?",
            "output": QuestionFilter.parse_obj(
                {
                    "feedback": "This question asks for a comparison of the ALMA-13B-R model's performance against other translation models within the WMT'23 study, specifically referring to results in 'context1' and 'context2'. While it clearly specifies the model of interest (ALMA-13B-R) and the study (WMT'23), it assumes access to and understanding of 'context1' and 'context2' without explaining what these contexts entail. This makes the question unclear for those not familiar with the WMT'23 study or these specific contexts. To improve clarity and answerability for a broader audience, the question could benefit from defining or describing 'context1' and 'context2' or explaining the criteria used for comparison in these contexts.",
                    "verdict": "0",
                }
            ).dict(),
        },
        {
            "question": "How do KIWI-XXL and XCOMET compare to the gold standard references in Table 1 in terms of evaluation scores, translation model performance, and success rate in surpassing the references?",
            "output": QuestionFilter.parse_obj(
                {
                    "feedback": "This question is specific and clear, asking for a comparison of KIWI-XXL and XCOMET models against gold standard references in Table 1. It specifies the criteria for comparison (evaluation scores, translation model performance, and success rate), making it understandable and answerable based on the provided details.",
                    "verdict": "1",
                }
            ).dict(),
        },
    ],
)

This version ensures the instruction is clear and detailed, providing a structured approach to assessing questions for clarity and answerability [1].
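
For reference, the examples in this prompt imply an output model with feedback and verdict fields. A minimal pydantic sketch of such a model (hypothetical; the QuestionFilter class actually shipped with ragas may be defined differently) looks like this:

from pydantic import BaseModel

# Hypothetical stand-in for the QuestionFilter output model, inferred from the
# examples above; the real class in ragas may differ.
class QuestionFilter(BaseModel):
    feedback: str
    verdict: str

# parse_obj(...).dict() round-trips dicts shaped like the prompt's example outputs.
example = QuestionFilter.parse_obj(
    {"feedback": "The question is specific and clear.", "verdict": "1"}
).dict()
print(example)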

To continue talking to Dosu, mention @dosu.

jjmachan commented 3 months ago

hey @jaqennnn thanks for pointing it out! Would you want to push a PR for that fix? I would be more than happy to get it merged and improve that prompt if you're interested 🙂