explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

Method get_prompts in MetricWithLLM does not return any values #1422

Open f2bo opened 4 hours ago

f2bo commented 4 hours ago

Describe the bug

Calling get_prompts on any metric always returns an empty dictionary. I've just started exploring Ragas and haven't found documentation for this method, but I assume it should return every prompt used to compute a given metric.

I'm not a Python programmer and don't know how this should be corrected, but it seems that the method looks for members that are instances of the Prompt class. However, the prompts are initialized with Field objects and are not picked up by the isinstance(value, Prompt) test.

def get_prompts(self) -> t.Dict[str, Prompt]:
    prompts = {}
    for name, value in inspect.getmembers(self):
        if isinstance(value, Prompt):
            prompts.update({name: value})
    return prompts

Ragas version: 0.1.20
Python version: 3.12.6

Code to Reproduce

from ragas.metrics import faithfulness, answer_correctness, answer_relevancy, context_precision, context_recall
metrics = [faithfulness, answer_correctness, answer_relevancy, context_precision, context_recall]
for metric in metrics:
    print(f"{metric.name} - number of prompts: {len(metric.get_prompts())}")

Output

faithfulness - number of prompts: 0
answer_correctness - number of prompts: 0
answer_relevancy - number of prompts: 0
context_precision - number of prompts: 0
context_recall - number of prompts: 0

Expected behavior

For example, when called for the faithfulness metric, I expect get_prompts to return a dictionary containing both nli_statements_message and statement_prompt.

dosubot[bot] commented 4 hours ago

Hey, @f2bo! I'm here to assist you with any bugs, questions, or contributions. Let's work together to solve this issue.

The issue you're encountering with the get_prompts method not returning any values is likely due to the way prompts are being initialized in the MetricWithLLM class. The method is currently checking for instances of the Prompt class directly, but if the prompts are initialized using Field objects, they might not be recognized as Prompt instances.

To address this, you can modify the get_prompts method to check for attributes that are initialized with Field objects and then retrieve the actual Prompt instances from these fields. Here's a possible solution:

def get_prompts(self) -> t.Dict[str, Prompt]:
    prompts = {}
    for name, value in inspect.getmembers(self):
        # Check if the attribute is a Field and has a default value that is a Prompt
        if isinstance(value, Field) and isinstance(value.default, Prompt):
            prompts.update({name: value.default})
    return prompts

This modification checks if the attribute is a Field and if its default value is an instance of Prompt. This should help in correctly identifying and returning the prompts used in the metrics.

To continue talking to Dosu, mention @dosu.

f2bo commented 3 hours ago

@dosu I'm afraid that didn't help. The default value in this case appears to be dataclasses.MISSING (an instance of dataclasses._MISSING_TYPE), because the Field is not initialized with a default value and instead uses a default_factory lambda expression. I'm not sure how the test needs to be adjusted to take this into account.

statement_prompt: Prompt = field(default_factory=lambda: LONG_FORM_ANSWER_PROMPT)

dosubot[bot] commented 3 hours ago

To handle the case where the Field is initialized with a default_factory lambda expression, you need to evaluate the factory to get the actual Prompt instance. Here's how you can adjust the get_prompts method to account for this:

def get_prompts(self) -> t.Dict[str, Prompt]:
    prompts = {}
    for name, value in inspect.getmembers(self):
        # Check if the attribute is a Field
        if isinstance(value, Field):
            # If the Field has a default_factory, evaluate it to get the default value
            if value.default_factory is not dataclasses.MISSING:
                default_value = value.default_factory()
                if isinstance(default_value, Prompt):
                    prompts.update({name: default_value})
    return prompts

This modification checks if the Field has a default_factory and evaluates it to obtain the default value. If the evaluated value is an instance of Prompt, it is added to the prompts dictionary. This should help in correctly identifying and returning the prompts used in the metrics.
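
The default versus default_factory distinction can be checked in isolation. The following is a minimal sketch, not ragas code: Prompt, DemoMetric, and prompts_from_fields are hypothetical stand-ins. It reads the field specifications through dataclasses.fields rather than inspect.getmembers, because on a dataclass instance the attributes already hold the resolved values rather than Field objects.

```python
import dataclasses
from dataclasses import dataclass, field


class Prompt:
    """Hypothetical stand-in for a prompt class; not the ragas implementation."""

    def __init__(self, name: str) -> None:
        self.name = name


# Module-level prompt shared through a default_factory, mirroring the
# pattern quoted earlier in this thread.
LONG_FORM_ANSWER_PROMPT = Prompt("long_form_answer")


@dataclass
class DemoMetric:
    """Hypothetical metric with one prompt field and one non-prompt field."""

    statement_prompt: Prompt = field(default_factory=lambda: LONG_FORM_ANSWER_PROMPT)
    max_retries: int = 1


def prompts_from_fields(metric) -> dict:
    """Collect Prompt-valued fields by reading the instance attributes."""
    found = {}
    for f in dataclasses.fields(metric):
        value = getattr(metric, f.name)
        if isinstance(value, Prompt):
            found[f.name] = value
    return found


metric = DemoMetric()
spec = {f.name: f for f in dataclasses.fields(metric)}["statement_prompt"]

# With default_factory, the Field's `default` is the MISSING sentinel.
default_is_missing = spec.default is dataclasses.MISSING
factory_is_set = spec.default_factory is not dataclasses.MISSING
found = prompts_from_fields(metric)

print(default_is_missing, factory_is_set, sorted(found))
```

Reading the value from the instance avoids calling default_factory a second time, which matters if a factory builds a fresh object on each call or if the attribute was overridden at construction.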


f2bo commented 25 minutes ago

Actually, while the suggestion above appears to be correct in that it will identify any Prompt members in a metric that have been initialized with a Field's default_factory, there seems to be a second problem. The faithfulness metric, and I suppose other metrics, defines its prompts using a different Prompt class than the one being checked by the get_prompts method.

Metric faithfulness uses:

from ragas.llms.prompt import Prompt

Method get_prompts checks for:

from ragas.experimental.llms.prompt import PydanticPrompt as Prompt

If I replace the PydanticPrompt import in the base.py file with the import from ragas.llms.prompt, the method begins to return the expected values. However, I don't understand why there is a different Prompt class and whether this change breaks something else.
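
The class mismatch can be reproduced without ragas. This is a minimal sketch under stated assumptions: LegacyPrompt and PydanticPrompt are hypothetical stand-ins for the two unrelated classes named above, DemoMetric is a made-up metric, and get_prompts here takes the class to test against as an explicit parameter, which the real method does not.

```python
import inspect
from dataclasses import dataclass, field


class LegacyPrompt:
    """Hypothetical stand-in for the class from ragas.llms.prompt."""


class PydanticPrompt:
    """Hypothetical stand-in for the class from ragas.experimental.llms.prompt."""


@dataclass
class DemoMetric:
    """Hypothetical metric whose prompts are all LegacyPrompt instances."""

    statement_prompt: LegacyPrompt = field(default_factory=LegacyPrompt)
    nli_statements_message: LegacyPrompt = field(default_factory=LegacyPrompt)


def get_prompts(obj, prompt_cls) -> dict:
    """Same getmembers/isinstance scan as in the issue, with the class as a parameter."""
    return {
        name: value
        for name, value in inspect.getmembers(obj)
        if isinstance(value, prompt_cls)
    }


metric = DemoMetric()

# Scanning for an unrelated class finds nothing, even though the
# attributes exist and hold prompt objects.
with_wrong_class = get_prompts(metric, PydanticPrompt)

# Scanning for the class the metric actually uses finds both prompts.
with_right_class = get_prompts(metric, LegacyPrompt)

print(len(with_wrong_class), sorted(with_right_class))
```

In this sketch the scan itself is sound and only the class passed to isinstance decides the result, which matches the observation that swapping the import makes get_prompts return the expected values. Whether that swap is safe elsewhere in ragas is not something this example can show.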