run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: Problem getting pydantic output out of Llama Index #13953

Closed: fbpo23 closed this issue 5 months ago

fbpo23 commented 5 months ago

Bug Description

I am trying to get structured output from LlamaIndex's query engine, but the output is coming out empty. I dove into LlamaIndex's source code to find the problem, and found that the response is correctly retrieved from the LLM as JSON but gets wiped out by both the SynthesizeEndEvent and the QueryEndEvent (I verified the latter by commenting out the SynthesizeEndEvent so that the response reached the QueryEndEvent intact).

If I comment out both of these events, the output is correctly parsed as the output_cls I defined on the query engine.

I am using AzureOpenAI as the LLM and a Hugging Face embedding model, sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2.

To debug further, I went into the dispatcher code and printed the name of each event being triggered and the handler that was handling it, in hopes of finding which part of the code was causing this. I also printed the response before and after those events (a less invasive way to capture the same trace is sketched after the log below). The result is the following:

Event name: RetrievalStartEvent| Handler name: NullEventHandler
Event name: EmbeddingStartEvent| Handler name: NullEventHandler
Event name: EmbeddingEndEvent| Handler name: NullEventHandler
Event name: RetrievalEndEvent| Handler name: NullEventHandler
Event name: SynthesizeStartEvent| Handler name: NullEventHandler
Event name: GetResponseStartEvent| Handler name: NullEventHandler
Event name: LLMStructuredPredictStartEvent| Handler name: NullEventHandler
Event name: LLMChatStartEvent| Handler name: NullEventHandler
Event name: LLMChatEndEvent| Handler name: NullEventHandler
Event name: LLMStructuredPredictEndEvent| Handler name: NullEventHandler
Event name: GetResponseEndEvent| Handler name: NullEventHandler
Response before event: {"is_question":true,"has_context_to_answer":true,"response":"Edison was an American inventor."}
Event name: SynthesizeEndEvent| Handler name: NullEventHandler
Response after event: {}
Query result before event: {}
Event name: QueryEndEvent| Handler name: NullEventHandler
Query result after event: {}
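
For reference, the trace above can be captured without patching the dispatcher source. This is a minimal sketch against the public instrumentation API of llama-index-core 0.10.x; the PrintEventHandler name is my own:

from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.event_handlers import BaseEventHandler


class PrintEventHandler(BaseEventHandler):
    """Log every instrumentation event that passes through the dispatcher."""

    @classmethod
    def class_name(cls) -> str:
        return "PrintEventHandler"

    def handle(self, event, **kwargs) -> None:
        # Every BaseEvent subclass reports its name via class_name().
        print(f"Event name: {event.class_name()}")


# Attach to the root dispatcher so all pipeline events are logged.
get_dispatcher().add_event_handler(PrintEventHandler())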

Investigating this NullEventHandler, it seemed to me that it is a placeholder event handler with no logic in it, so it shouldn't be able to affect the response. But somehow it was.

That led me to write this issue. Any suggestions?

Version

llama-index-agent-openai==0.2.7
llama-index-core==0.10.43
llama-index-embeddings-huggingface==0.2.1
llama-index-llms-azure-openai==0.1.8
llama-index-llms-openai==0.1.22
llama-index-readers-file==0.1.23
llama-index-vector-stores-postgres==0.1.7

Steps to Reproduce

Run the following code, with a .txt file containing some information for RAG placed in the data folder:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.azure_openai import AzureOpenAI
import os
from pydantic import BaseModel  # pydantic v2 -- this import is what matters below

class ResponseModel(BaseModel):
    """Data model for response."""

    is_question: bool
    has_context_to_answer: bool
    response: str

# The embed model must be configured before the index is built.
Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
documents = SimpleDirectoryReader("../data").load_data()
index = VectorStoreIndex.from_documents(documents)

Settings.llm = AzureOpenAI(
    deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    temperature=0,
    additional_kwargs={
        "stop": "\n\n",
    },
)

query_engine = index.as_query_engine(output_cls=ResponseModel, response_mode="compact")
response = query_engine.query("Who is Edison?")
print(response)  # prints "{}" instead of the populated ResponseModel JSON

With this code the output comes out empty. However, if I comment out lines 53-55 of llama_index.core.base.base_query_engine.py (the QueryEndEvent) and lines 255-260 of llama_index.core.response_synthesizers.base.py (the SynthesizeEndEvent), the output comes out filled as it is supposed to.

Relevant Logs/Tracebacks

No response

dosubot[bot] commented 5 months ago

Hey there, @fbpo23! I'm here to help you with any issues you're facing. I can assist with bug fixes, answer questions, and guide you on becoming a contributor. Let's work together to resolve the problem you've encountered.

The issue you're encountering with the SynthesizeEndEvent and QueryEndEvent wiping out the response in Llama Index is indeed puzzling, especially since the NullEventHandler is supposed to be a no-op. Here are a few steps and considerations to help you debug and potentially resolve this issue:

  1. Ensure Proper Event Handling:
    • Verify that no other event handlers are inadvertently modifying the response. The NullEventHandler should not affect the response, but ensure that no other handlers are registered for these events.
  2. Check for Side Effects:
    • Even though NullEventHandler does nothing, ensure that there are no side effects or state changes happening elsewhere in the code that might be triggered by these events.
  3. Debugging Steps:
    • Add more detailed logging around the SynthesizeEndEvent and QueryEndEvent to capture the state of the response before and after these events. This might help pinpoint where the response is being altered.
  4. Review Event Dispatching Logic:
    • Ensure that the event dispatching logic is correctly implemented and that the response is not being inadvertently reset or cleared during event handling.
  5. Alternative Approach:
    • As a temporary workaround, you can continue commenting out the lines related to SynthesizeEndEvent and QueryEndEvent if they are not critical for your use case. However, this is not a long-term solution.

Here is a snippet of how you might add more detailed logging:

# In llama_index.core.base.base_query_engine.py
# Around lines 53-55
print(f"Response before QueryEndEvent: {response}")
# Existing code for QueryEndEvent
print(f"Response after QueryEndEvent: {response}")

# In llama_index.core.response_synthesizers.base.py
# Around lines 255-260
print(f"Response before SynthesizeEndEvent: {response}")
# Existing code for SynthesizeEndEvent
print(f"Response after SynthesizeEndEvent: {response}")

Additionally, ensure that the output_cls parameter is correctly defined and that the response conforms to the expected structure of the ResponseModel class.

If the issue persists, consider creating a minimal reproducible example and sharing it with the maintainers for further investigation. This will help them understand the context and provide a more targeted solution.


logan-markewich commented 5 months ago

@fbpo23 try from pydantic.v1 import BaseModel instead
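
Applied to the repro above, the only change needed is the import (a sketch; everything else stays the same):

# Use the pydantic v1 bridge for the output class; llama-index-core 0.10.x
# still uses pydantic v1 internally, so a v1 model passes through the
# SynthesizeEndEvent/QueryEndEvent payloads without coming out empty.
from pydantic.v1 import BaseModel


class ResponseModel(BaseModel):
    """Data model for response."""

    is_question: bool
    has_context_to_answer: bool
    response: str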

fbpo23 commented 5 months ago

> @fbpo23 try from pydantic.v1 import BaseModel instead

That worked! Thanks a lot. I am curious, though: why would the response come out empty after that event if there is no logic inside it?

logan-markewich commented 5 months ago

I have no idea -- I think it's related to how we internally use the pydantic v1 bridge, and there's some weird pydantic magic happening when classes from v1 and v2 touch each other
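
A minimal illustration of the mismatch, in plain pydantic with no LlamaIndex involved: the v1-bridge BaseModel and the v2 BaseModel are unrelated classes, so v1-era code paths fail isinstance checks against v2 instances (and vice versa):

from pydantic import BaseModel as V2Model
from pydantic.v1 import BaseModel as V1Model


class Answer(V2Model):
    response: str


answer = Answer(response="Edison was an American inventor.")
print(isinstance(answer, V2Model))  # True
print(isinstance(answer, V1Model))  # False -- invisible to v1 code paths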

logan-markewich commented 5 months ago

Hoping to move to v2 soon 🙏🏻

fbpo23 commented 5 months ago

I see. Thanks for the help!