langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

When using GraphCypherQAChain to fetch documents from Neo4j, the embeddings field is also returned, which consumes all context window tokens #22755

Open liadlevy-pando opened 3 weeks ago

liadlevy-pando commented 3 weeks ago


Example Code


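# Imports for the package versions listed under System Info
from langchain.chains import GraphCypherQAChain
from langchain_community.graphs import Neo4jGraph
from langchain_openai import AzureChatOpenAI

neo4j_uri = "bolt://localhost:7687"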
neo4j_user = "neo4j"
neo4j_password = "....."

graph = Neo4jGraph(
    url=neo4j_uri,
    username=neo4j_user,
    password=neo4j_password,
    database="....",
    enhanced_schema=True,
)

cypher_chain = GraphCypherQAChain.from_llm(
    cypher_llm=AzureChatOpenAI(
        deployment_name="<.......>",
        azure_endpoint="https://.........openai.azure.com/",
        openai_api_key=".....",
        api_version=".....",
        temperature=0
    ),
    qa_llm=AzureChatOpenAI(
        deployment_name="......",
        azure_endpoint="......",
        openai_api_key="....",
        api_version=".....",
        temperature=0
    ),
    graph=graph,
    verbose=True,
)

response = cypher_chain.invoke(
    {"query": "How many tasks do i have"}
)

Error Message and Stack Trace (if applicable)

openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 32768 tokens. However, your messages resulted in 38782 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

Description

A chain built with GraphCypherQAChain.from_llm generates a Cypher query that returns every property of the matched nodes, including embeddings. There is currently no way to include or exclude specific properties from the returned documents, so the embedding vectors are passed to the QA prompt along with everything else and the request exceeds the model's context window.
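To illustrate the kind of filtering that is missing, here is a minimal sketch that drops embedding-like values (long, all-numeric lists) from query results before they reach the QA prompt. It is not part of LangChain; the function names, the 128-element threshold, and the sample record are assumptions made for illustration.

```python
from typing import Any

# Hypothetical threshold: numeric lists longer than this are treated as embeddings.
MAX_LIST_LENGTH = 128


def _is_embedding_like(value: Any, max_len: int) -> bool:
    """True for a list of more than max_len items that are all numbers."""
    return (
        isinstance(value, list)
        and len(value) > max_len
        and all(
            isinstance(x, (int, float)) and not isinstance(x, bool)
            for x in value
        )
    )


def drop_embedding_like(value: Any, max_len: int = MAX_LIST_LENGTH) -> Any:
    """Recursively remove embedding-like lists nested inside dicts and lists."""
    if isinstance(value, dict):
        return {
            key: drop_embedding_like(item, max_len)
            for key, item in value.items()
            if not _is_embedding_like(item, max_len)
        }
    if isinstance(value, list):
        return [
            drop_embedding_like(item, max_len)
            for item in value
            if not _is_embedding_like(item, max_len)
        ]
    return value


# Made-up record shaped like a row returned for "MATCH (t:Task) RETURN t".
records = [{"t": {"title": "Write report", "tags": ["a", "b"], "embedding": [0.1] * 1536}}]
cleaned = drop_embedding_like(records)
print(cleaned)
```

The heuristic does not need to know property names, but it would also drop any legitimate long numeric list, so the threshold is a trade-off.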

System Info

Packages

langchain-community==0.2.2
neo4j==5.18.0/5.19.0/5.20.0
langchain==0.2.2
langchain-core==0.2.4
langchain-openai==0.1.

martinohanlon commented 3 weeks ago

Have you tried providing a prompt to the GraphCypherQAChain with specific instructions to not return the embedding property?

There are some tips in this blog post
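The prompt-based suggestion could look roughly like the sketch below. The instruction wording is an assumption and is not taken from the blog post; `{schema}` and `{question}` are the two variables the Cypher generation prompt is expected to receive. The template is rendered with plain `str.format` here so that the example runs without LangChain installed.

```python
# Hypothetical instruction text for the Cypher-generating LLM.
CYPHER_TEMPLATE = """Task: Generate a Cypher statement to query a graph database.
Use only the relationship types and properties provided in the schema.
Never return the `embedding` property. Return only the specific
properties needed to answer the question, not whole nodes.

Schema:
{schema}

Question:
{question}
"""


def render_cypher_prompt(schema: str, question: str) -> str:
    """Fill the template the way a prompt template would at run time."""
    return CYPHER_TEMPLATE.format(schema=schema, question=question)


# Made-up schema string, for illustration only.
example = render_cypher_prompt(
    schema="(:Task {title: STRING, embedding: LIST})",
    question="How many tasks do I have?",
)
print(example)
```

With LangChain installed, the same string can be wrapped in `PromptTemplate(input_variables=["schema", "question"], template=CYPHER_TEMPLATE)` from `langchain_core.prompts` and passed as `cypher_prompt=` to `GraphCypherQAChain.from_llm`.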

liadlevy-pando commented 3 weeks ago

@martinohanlon It works, but the model sometimes doesn't follow the instructions. It would be better to have a way to filter out properties.

There are also other properties that we don't want to send as context, so I think this would be a good and practical feature.
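As a sketch of what such a feature might look like, the snippet below wraps a query function so that named properties are removed from every result before it is used as context. Everything here is hypothetical: `exclude_properties`, `filtered_query`, and `fake_query` are illustrative names, and `fake_query` stands in for a real database call.

```python
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]


def exclude_properties(value: Any, excluded: Iterable[str]) -> Any:
    """Recursively drop dict keys whose name is in `excluded`."""
    excluded_set = set(excluded)
    if isinstance(value, dict):
        return {
            key: exclude_properties(item, excluded_set)
            for key, item in value.items()
            if key not in excluded_set
        }
    if isinstance(value, list):
        return [exclude_properties(item, excluded_set) for item in value]
    return value


def filtered_query(
    run_query: Callable[[str], List[Record]], excluded: Iterable[str]
) -> Callable[[str], List[Record]]:
    """Return a query function that strips the excluded properties."""
    excluded_set = set(excluded)

    def wrapper(cypher: str) -> List[Record]:
        return exclude_properties(run_query(cypher), excluded_set)

    return wrapper


def fake_query(cypher: str) -> List[Record]:
    """Stand-in for a database call; returns one made-up row."""
    return [{"t": {"title": "Write report", "embedding": [0.1, 0.2], "internal_notes": "private"}}]


query = filtered_query(fake_query, excluded=["embedding", "internal_notes"])
result = query("MATCH (t:Task) RETURN t")
print(result)
```

One possible place to apply this is a thin subclass of `Neo4jGraph` whose `query` method post-processes its return value. That hook is an assumption about how to wire it in, not an existing option of the chain.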