langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

Neo4jGraph.addGraphDocuments function fails to populate text content in the Document node #5007

Closed bilalshareef closed 7 months ago

bilalshareef commented 7 months ago


Example Code

import 'neo4j-driver'
import { Neo4jGraph } from '@langchain/community/graphs/neo4j_graph'
import { ChatOpenAI } from '@langchain/openai'
import { LLMGraphTransformer } from './llm-transformer.js'

const url = process.env.NEO4J_URI
const username = process.env.NEO4J_USERNAME
const password = process.env.NEO4J_PASSWORD
const openAIApiKey = process.env.OPENAI_API_KEY

const graph = await Neo4jGraph.initialize({ url, username, password })

const documents = [
  {
    pageContent: `
    In the small town of Willowbrook, nestled between rolling hills and whispering forests, lived a peculiar old man named Elias. He spent his days tending to his garden, whispering secrets to the flowers that bloomed in vibrant hues.
    One chilly autumn morning, a young girl named Lily stumbled upon Elias's garden. Entranced by its beauty, she watched as Elias gently coaxed life from the earth. Intrigued, she approached him.
    Elias smiled warmly, inviting her to explore. Lily discovered that each flower held a story - tales of love, loss, and resilience whispered by the wind itself. She listened, captivated by the wisdom hidden within the petals.
    `,
    metadata: { a: 2 }
  },
  {
    pageContent: `As seasons passed, Lily returned to the garden, sharing her own stories with Elias. They laughed under the summer sun and found solace in the gentle embrace of autumn's breeze.
    One day, Elias's garden began to wither, the vibrant colors fading like memories lost to time. With a heavy heart, Lily sat beside him, holding his weathered hand.
    But as the last leaf fell, Elias whispered, "Life is but a fleeting bloom, my dear. Cherish each moment, for even in the silence, beauty remains."
    With a smile, he closed his eyes, leaving behind a garden filled not just with flowers, but with stories that would bloom for eternity.`,
    metadata: { a: 1 }
  }
]

const llm = new ChatOpenAI({
  temperature: 0,
  modelName: 'gpt-3.5-turbo-0125',
  openAIApiKey
})

const llmTransformer = new LLMGraphTransformer(llm)
const graphDocuments = await llmTransformer.convertToGraphDocuments(documents)

await graph.addGraphDocuments(graphDocuments, {
  baseEntityLabel: true,
  includeSource: true
})

console.log('Completed adding graph documents!!!')

await graph.close()

Error Message and Stack Trace (if applicable)

No response

Description

I am trying to add graph documents to Neo4j using the LangChain library. When I call graph.addGraphDocuments, the Document node in Neo4j is not populated with the text content of each GraphDocument.

The issue appears to be in this line, which should read SET d.text = $document.pageContent instead of SET d.text = $document.page_content.

page_content is valid in the Python version of the library, but in JavaScript the property is named pageContent.
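
To illustrate why the text ends up empty, here is a minimal sketch in plain JavaScript that needs no Neo4j connection. The helper lookupParam is hypothetical (it is not part of LangChain or neo4j-driver); it only mimics how Cypher resolves $document.<key> against a parameter map, where a missing key evaluates to null and SET d.text = null leaves the property unset.

```javascript
// Hypothetical helper: mimics Cypher map property access, where a key that
// is missing from the map evaluates to null instead of raising an error.
function lookupParam(params, path) {
  let value = params;
  for (const key of path.split('.')) {
    if (value === null || value === undefined || !(key in Object(value))) {
      return null;
    }
    value = value[key];
  }
  return value;
}

// A JavaScript document carries its text under pageContent (camelCase).
const params = {
  document: { pageContent: 'Elias tended his garden.', metadata: { id: 'doc-1' } }
};

const snakeCase = lookupParam(params, 'document.page_content');
const camelCase = lookupParam(params, 'document.pageContent');

console.log(snakeCase); // null
console.log(camelCase); // Elias tended his garden.
```

Because the lookup fails silently, no error or stack trace is produced, which matches the empty "Error Message" section above.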

System Info

platform - mac
Node version - 21.7.1
yarn version - 3.4.1

dosubot[bot] commented 7 months ago

Hey @bilalshareef! 🌟 I'm Dosu, your friendly neighborhood bot here to sprinkle some magic on those pesky bugs, answer your burning questions, and help you navigate the path to becoming a contributor. While we wait for a human to chime in, think of me as your go-to for a little extra support. Let's tackle this together! 🤖✨

Based on your description, it seems like you've identified the issue correctly. The addGraphDocuments function in the Neo4jGraph class is indeed trying to set the document text using SET d.text = $document.page_content, which does not match the property name in your JavaScript objects (pageContent).

To resolve this issue, you should modify the INCLUDE_DOCS_QUERY string in the Neo4jGraph class to use pageContent instead of page_content. This change will align the property name in the query with the property name in your JavaScript objects.

Here's the specific change you need to make:

Original line in INCLUDE_DOCS_QUERY:

SET d.text = $document.page_content 

Modified line:

SET d.text = $document.pageContent

After making this change, the INCLUDE_DOCS_QUERY should look like this:

const INCLUDE_DOCS_QUERY = `
  MERGE (d:Document {id:$document.metadata.id}) 
  SET d.text = $document.pageContent 
  SET d += $document.metadata 
  WITH d 
`;

This modification ensures that when the addGraphDocuments function is called, the Document node's text property is correctly populated with the content from your JavaScript objects' pageContent property.

Please note that this change should be made in your local copy of the LangChain JS library. If you're using a package manager like npm or yarn, you might need to fork the repository, make the change, and then point your package.json to your forked version.
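
If forking is not practical, one possible stopgap is to give each source document a page_content alias before calling addGraphDocuments. This is only a sketch: addPageContentAlias is a hypothetical helper, and it assumes the library hands the source object's own enumerable properties to the driver as the $document parameter, which has not been verified here.

```javascript
// Hypothetical workaround helper (not part of LangChain).
// Assumption: the source object's own properties reach Cypher as $document,
// so an extra page_content key would satisfy the unpatched query.
function addPageContentAlias(graphDocuments) {
  for (const graphDocument of graphDocuments) {
    // Mutates in place so that class instances keep their prototypes.
    graphDocument.source.page_content = graphDocument.source.pageContent;
  }
  return graphDocuments;
}

// Stand-in data with the same shape as the output of convertToGraphDocuments.
const sampleGraphDocuments = [
  {
    nodes: [],
    relationships: [],
    source: { pageContent: 'Lily returned to the garden.', metadata: { a: 1 } }
  }
];

const patched = addPageContentAlias(sampleGraphDocuments);
console.log(patched[0].source.page_content); // Lily returned to the garden.
```

In the original script this would wrap the first argument, for example graph.addGraphDocuments(addPageContentAlias(graphDocuments), { baseEntityLabel: true, includeSource: true }). The alias should be dropped once the query in the library uses pageContent.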

As for the structure of the GraphDocument object in the LangChain JS library, it extends the Document class from @langchain/core/documents and contains three main properties: nodes, relationships, and source. Each Node object has properties like id, type, properties, and lc_namespace. Similarly, each Relationship object has properties like source, target, type, properties, and lc_namespace. The GraphDocument constructor takes an object with nodes, relationships, and source properties to initialize a new instance.
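
The shape described above can be sketched with plain object literals. This is illustrative only: the real Node, Relationship, and GraphDocument classes come from the library, the sample values are made up, and lc_namespace is omitted.

```javascript
// Plain-object sketch mirroring the field names listed above; these are not
// instances of the library's classes.
const elias = { id: 'Elias', type: 'Person', properties: {} };
const garden = { id: 'Garden', type: 'Place', properties: {} };

const graphDocumentShape = {
  nodes: [elias, garden],
  relationships: [
    { source: elias, target: garden, type: 'TENDS', properties: {} }
  ],
  // The source document keeps its text under pageContent, not page_content.
  source: {
    pageContent: 'Elias spent his days tending to his garden.',
    metadata: { a: 2 }
  }
};

console.log(Object.keys(graphDocumentShape.source)); // [ 'pageContent', 'metadata' ]
```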

I hope this helps! If you have any further questions or run into any other issues, feel free to ask.
