
[Bug]: LLMSingleSelector returning more than one choice #16480


nazim-ashman-oc commented 1 week ago

Bug Description

LLMSingleSelector sometimes returns more than one choice, which breaks some of my downstream code that expects exactly one.

With the prompt used (default prompt):

metadata={'prompt_type': <PromptType.SINGLE_SELECT: 'single_select'>} template_vars=['num_choices', 'context_list', 'query_str'] kwargs={} output_parser=<llama_index.core.output_parsers.selection.SelectionOutputParser object at 0xffff7e5af820> template_var_mappings=None function_mappings=None template="Some choices are given below. It is provided in a numbered list (1 to {num_choices}), where each item in the list corresponds to a summary.\n---------------------\n{context_list}\n---------------------\nUsing only the choices above and not prior knowledge, return the choice that is most relevant to the question: '{query_str}'\n"

Example output:

[
    {
        "choice": 2,
        "reason": "The question is asking to compare two entities, which aligns with the description of choice 2."
    },
    {
        "choice": 1,
        "reason": "The question provides specific company names, which aligns with the description of choice 1."
    }
]

Would the fix be to replace _structured_output_to_selector_result with something that takes the first index of the choices returned? Or is it something like validating the response output from the LLM and then sending it back to the LLM for a fix?
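For illustration, here is a minimal sketch of the first approach: a custom output parser that truncates the parsed answers to the first choice. The import paths and the output_parser argument to from_defaults are assumptions based on llama_index 0.11.x, not something confirmed in this thread.

from llama_index.core.output_parsers.selection import SelectionOutputParser
from llama_index.core.selectors import LLMSingleSelector


class FirstChoiceOutputParser(SelectionOutputParser):
    """Keep only the first answer if the LLM returns several."""

    def parse(self, output: str):
        structured = super().parse(output)
        # parsed_output is a list of Answer objects; truncate it to one.
        structured.parsed_output = structured.parsed_output[:1]
        return structured


selector = LLMSingleSelector.from_defaults(output_parser=FirstChoiceOutputParser())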

Version

0.11.17

Steps to Reproduce

LLMSingleSelector
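The report does not include a script, but a minimal reproduction presumably looks like the sketch below; the choice descriptions and query are invented for illustration.

from llama_index.core.selectors import LLMSingleSelector

selector = LLMSingleSelector.from_defaults()  # uses the default prompt quoted above
result = selector.select(
    choices=[
        "Useful for questions that name a specific company.",
        "Useful for comparing two entities.",
    ],
    query="Compare company A and company B.",
)
# Expected exactly one selection; intermittently len(result.selections) > 1.
print(result.selections)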

Relevant Logs/Tracebacks

No response

dosubot[bot] commented 1 week ago

To address the issue where LLMSingleSelector returns more than one choice, you can modify the _structured_output_to_selector_result function to ensure it processes only a single selection. Here's a suggested fix:

from typing import Any, List, cast

# These import paths match llama_index 0.11.x (an assumption worth verifying):
from llama_index.core.base.base_selector import SelectorResult, SingleSelection
from llama_index.core.output_parsers.base import StructuredOutput
from llama_index.core.output_parsers.selection import Answer


def _structured_output_to_selector_result(output: Any) -> SelectorResult:
    """Convert structured output to selector result."""
    structured_output = cast(StructuredOutput, output)
    answers = cast(List[Answer], structured_output.parsed_output)

    # Ensure only one selection is processed
    if len(answers) != 1:
        raise ValueError("Expected exactly one choice for single selection.")

    # Adjust for zero indexing
    selections = [
        SingleSelection(index=answers[0].choice - 1, reason=answers[0].reason)
    ]
    return SelectorResult(selections=selections)

This modification raises an error if the number of choices is not exactly one, ensuring that only a single selection is processed [1][2].


logan-markewich commented 1 week ago

Replacing the output parser or customizing the prompt are both valid approaches
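For concreteness, a sketch of the prompt-customization route: restate the default template quoted above so it demands exactly one choice. This assumes from_defaults accepts a prompt_template_str argument, as in llama_index 0.11.x.

from llama_index.core.selectors import LLMSingleSelector

SINGLE_CHOICE_PROMPT = (
    "Some choices are given below. It is provided in a numbered list "
    "(1 to {num_choices}), where each item in the list corresponds to a summary.\n"
    "---------------------\n"
    "{context_list}\n"
    "---------------------\n"
    "Using only the choices above and not prior knowledge, return exactly ONE "
    "choice, the single one most relevant to the question: '{query_str}'\n"
)

selector = LLMSingleSelector.from_defaults(prompt_template_str=SINGLE_CHOICE_PROMPT)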

nazim-ashman-oc commented 1 week ago

> Replacing the output parser or customizing the prompt are both valid approaches

Ok, I'm guessing using something like PydanticSingleSelector is more robust? Does this only work with OpenAI, or can I use it with other LLMs too?
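For reference, a sketch of the PydanticSingleSelector route. It drives selection through a pydantic program that relies on function/tool calling; in llama_index 0.11.x, from_defaults builds an OpenAI-based program by default (via the llama-index-program-openai package), so support for a specific non-OpenAI LLM is an assumption to verify against the docs.

from llama_index.core.selectors import PydanticSingleSelector
from llama_index.core.tools import ToolMetadata

choices = [
    ToolMetadata(name="company_tool", description="Answers questions about a named company."),
    ToolMetadata(name="compare_tool", description="Compares two entities."),
]

# Defaults to an OpenAI pydantic program unless a custom program is supplied.
selector = PydanticSingleSelector.from_defaults()
result = selector.select(choices, query="Compare company A and company B.")
print(result.selections)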