explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

Evaluation with IBM WatsonX LLMs #1186

Open swayam-khandelwal opened 1 month ago

swayam-khandelwal commented 1 month ago

Hey! I have tried my hand at RAGAS with the WatsonX LLM. The major issue I am facing is that I continuously get the warning: "Failed to parse output. Returning None."

Is there any fix for this? Is it because of the type of response sent by the WatsonX LLM?

dosubot[bot] commented 1 month ago

Hey @swayam-khandelwal! I'm here to help you with any issues you're facing while waiting for a human maintainer. Let's work together to solve this problem!

The issue you're encountering, "Failed to parse output. Returning None," is likely related to the type of response sent by the WatsonX LLM. The RagasoutputParser class attempts to parse the result using a Pydantic schema. If parsing fails, it retries once with a prompt (FIX_OUTPUT_FORMAT) that asks the LLM to fix its output format. If it still fails, it logs the warning and returns None.

To address this issue, ensure that the response from WatsonX LLM conforms to the expected JSON schema. The get_json_format_instructions function provides the necessary instructions for the LLM to format its output correctly. You might need to adjust the prompt or the schema to better align with the responses generated by WatsonX LLM.

Here is the relevant part of the code that handles the parsing and retries:

class RagasoutputParser(PydanticOutputParser):
    async def aparse(  # type: ignore
        self, result: str, prompt: PromptValue, llm: BaseRagasLLM, max_retries: int = 1
    ):
        try:
            # First attempt: parse the raw LLM output against the Pydantic schema.
            output = super().parse(result)
        except OutputParserException:
            if max_retries != 0:
                # Ask the LLM to reformat its own output, then retry parsing.
                p_value = FIX_OUTPUT_FORMAT.format(
                    prompt=prompt.to_string(), completion=result
                )
                output = await llm.generate(p_value)
                result = output.generations[0][0].text
                return await self.aparse(result, prompt, llm, max_retries - 1)
            else:
                # Retries exhausted -- this is where the warning you see is logged.
                logger.warning("Failed to parse output. Returning None.")
                return None
        return output
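
If it is unclear what WatsonX is actually returning, one option is to replay the parsing step outside of ragas on a captured raw completion. This is a minimal sketch, not ragas code: the Verdict schema and the sample completion are hypothetical stand-ins, so substitute the schema and output from the metric that is failing for you.

import json

from pydantic import BaseModel, ValidationError

# Hypothetical stand-in for the Pydantic schema a ragas metric expects;
# replace it with the schema used by the failing metric.
class Verdict(BaseModel):
    statement: str
    verdict: int

# Paste a raw completion captured from the WatsonX LLM here.
raw_completion = '{"statement": "The sky is blue.", "verdict": 1}'

try:
    # Roughly what PydanticOutputParser does: the text must be valid JSON
    # that matches the schema, with no extra prose around it.
    parsed = Verdict(**json.loads(raw_completion))
    print("Parsed OK:", parsed)
except (json.JSONDecodeError, ValidationError) as exc:
    # If this fails, the ragas parser will fail in the same way and
    # eventually log "Failed to parse output. Returning None."
    print("Parse failed:", exc)

If the raw completion contains prose before or after the JSON, or is truncated, that is the most likely cause of the warning.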

Ensure that the WatsonX LLM's output is a well-formatted JSON string that matches the schema provided in the prompt instructions. Adjusting the prompt to include clear formatting instructions, as shown in the JSON_FORMAT_INSTRUCTIONS, can help mitigate this issue [1].
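
Separately, making the WatsonX generation more deterministic can reduce stray prose around the JSON. The sketch below is only an assumption-laden example: it assumes the langchain_ibm integration and ragas's LangchainLLMWrapper are available in your versions, and the model id, endpoint, project id, and parameter values are placeholders to replace with your own deployment details.

from langchain_ibm import WatsonxLLM
from ragas.llms import LangchainLLMWrapper

# Placeholder model id, endpoint, and project id -- substitute your own
# deployment details (the API key is typically read from WATSONX_APIKEY).
watsonx_llm = WatsonxLLM(
    model_id="ibm/granite-13b-chat-v2",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="YOUR_PROJECT_ID",
    params={
        # Greedy decoding and a larger token budget reduce truncated or
        # free-form completions that break JSON parsing.
        "decoding_method": "greedy",
        "max_new_tokens": 1024,
    },
)

# Wrap the LangChain LLM so ragas metrics can call it.
ragas_llm = LangchainLLMWrapper(watsonx_llm)

Depending on your ragas version, you can then pass this wrapped LLM via the llm argument of evaluate() or assign it to individual metrics.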

To continue talking to Dosu, mention @dosu.