Closed quillan86 closed 1 year ago
I would also be willing to contribute; I would just need a bit of help knowing where to put the code. The closest section seems to be vector stores, but Neo4j is not a vector store, so should it be a retriever or a tool, or do we just pretend Neo4j is a vector store?
There are already these relevant folders:
langchain.graphs (storing a barebones networkx graph)
langchain.indexes (storing graphs.py for the GraphIndexCreator, much like the VectorStore creator)
langchain.chains.graph_qa (storing a chain for graph QA)
So I think it's a matter of reformulating langchain.graphs to have a base.py et al. similar to langchain.vectorstores. That's why I said interface - it would be the creation of a new general object like VectorStore.
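To make the "interface" idea concrete, here is a minimal sketch of what a shared base class for graph backends could look like. The names GraphStore, InMemoryGraphStore, get_schema, query, and add_triple are all hypothetical and are not taken from the LangChain code base; the in-memory backend is a toy whose "query language" is just a relationship type name.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional, Tuple


class GraphStore(ABC):
    """Hypothetical base interface for graph backends (assumed name)."""

    @abstractmethod
    def get_schema(self) -> str:
        """Return a text description of the graph schema."""

    @abstractmethod
    def query(
        self, query: str, params: Optional[Dict[str, Any]] = None
    ) -> List[Dict[str, Any]]:
        """Run a backend-specific query and return rows as dictionaries."""


class InMemoryGraphStore(GraphStore):
    """Toy backend for illustration: stores (source, type, target) triples."""

    def __init__(self) -> None:
        self._triples: List[Tuple[str, str, str]] = []

    def add_triple(self, source: str, rel_type: str, target: str) -> None:
        self._triples.append((source, rel_type, target))

    def get_schema(self) -> str:
        # List the distinct relationship types seen so far.
        rel_types = sorted({rel for _, rel, _ in self._triples})
        return "Relationship types: " + ", ".join(rel_types)

    def query(
        self, query: str, params: Optional[Dict[str, Any]] = None
    ) -> List[Dict[str, Any]]:
        # In this toy backend the query string is a relationship type.
        return [
            {"source": s, "type": r, "target": t}
            for s, r, t in self._triples
            if r == query
        ]
```

A Neo4j-backed class would implement the same two abstract methods by sending Cypher to the database, so chains could depend on the base class rather than on a specific backend.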
We could possibly put the vector embedding portion of Neo4j under the vectorstore module, though I'd need to look at the code from the Medium article first.
I've already forked the repo and created a branch on my end for this although I haven't pushed changes yet.
Yeah, I wouldn't really add vector search in Neo4j for starters. I would try to add Cypher search first, something like schema-based Cypher generation that can be used on any graph:
https://medium.com/neo4j/generating-cypher-queries-with-chatgpt-4-on-any-graph-schema-a57d7082a7e7
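As a rough illustration of schema-based Cypher generation, the sketch below renders a graph schema as text and places it in a prompt alongside the user's question. The function names, the schema layout, and the prompt wording are assumptions made for this example and are not taken from the linked article or from LangChain; the LLM call itself is left out.

```python
from typing import Dict, List, Tuple


def format_schema(
    node_props: Dict[str, List[str]],
    relationships: List[Tuple[str, str, str]],
) -> str:
    """Render node labels, properties, and relationships as plain text."""
    lines = ["Node labels and properties:"]
    for label, props in sorted(node_props.items()):
        lines.append(f"  {label}: {', '.join(props)}")
    lines.append("Relationships:")
    for start, rel_type, end in relationships:
        lines.append(f"  (:{start})-[:{rel_type}]->(:{end})")
    return "\n".join(lines)


def build_cypher_prompt(schema: str, question: str) -> str:
    """Build a prompt asking an LLM for a single Cypher statement."""
    return (
        "You are generating Cypher for a Neo4j database.\n"
        "Use only the labels, properties, and relationship types "
        "listed in the schema.\n"
        "Return only the Cypher statement, with no explanation.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Question: {question}\n"
        "Cypher:"
    )


# Example usage with a made-up movie schema.
schema = format_schema(
    {"Person": ["name"], "Movie": ["title", "released"]},
    [("Person", "ACTED_IN", "Movie")],
)
prompt = build_cypher_prompt(schema, "Which movies did Tom Hanks act in?")
```

Because the schema is passed in as data, the same prompt builder works for any graph, which is the point of the schema-based approach.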
Yeah, that wouldn't be a priority atm (other than that it was a feature of the agent tools I mentioned earlier) - Cypher search would be the priority.
I've started the PR, you can take a look
This was added, so you could probably close this issue:
@tomasonjo Quick question: Does GraphCypherQAChain work well for you? If yes, with what version?
I tried the example in the docs with the current latest version (0.0.197), but it throws.
What's the error you are getting?
I get the issue now. The LLM simply doesn't respond with a plain Cypher statement, so naturally Neo4jGraph.query() fails.
Maybe it's because I'm using an Azure LLM instance and it doesn't behave the same (?)
I don't have access to Azure LLMs, so I can't test it. You can ask the LLM to wrap the statement in three backticks, since the code can then extract the statement.
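For illustration, here is one way such an extraction step could work. This is a standalone sketch, not the chain's actual implementation; extract_cypher and FENCE are hypothetical names. It returns the contents of the first backtick-fenced block in the reply and falls back to the whole reply when no fence is present.

```python
import re

# Three backticks, built programmatically so this example stays readable.
FENCE = "`" * 3

# Match an optional language tag (e.g. "cypher") on the opening fence line,
# then capture everything up to the closing fence.
_FENCED_BLOCK = re.compile(
    FENCE + r"(?:[a-zA-Z]*\n)?(.*?)" + FENCE,
    re.DOTALL,
)


def extract_cypher(text: str) -> str:
    """Return the first fenced block in text, or the stripped text itself."""
    match = _FENCED_BLOCK.search(text)
    if match:
        return match.group(1).strip()
    return text.strip()


# Example: a chatty reply that wraps the statement in a fenced block.
reply = (
    "Sure, here is the query:\n"
    + FENCE + "cypher\nMATCH (n) RETURN n\n" + FENCE
    + "\nLet me know if you need anything else."
)
statement = extract_cypher(reply)
```

Only the extracted statement would then be passed to the database, so any surrounding explanation from the LLM no longer breaks the query.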
Feature request
There is a need for graph databases to be integrated into LangChain. NetworkX isn't suitable for scalable graph databases that need to be queried, particularly those with tens of thousands or more nodes and edges. This integration is necessary for graph databases to compete with vector databases for information extraction within LangChain.
There is already a Medium article and GitHub repo describing one way this can be implemented, but it would be ideal if something like this were integrated into LangChain itself. That implementation also offers Neo4j embeddings as an option, which should be implemented as well.
Motivation
The Graph Index Creator and other small forms of graphs within LangChain use NetworkX, which isn't scalable in production for full-blown knowledge graphs on the scale of vector databases. I have a particular need to use a graph database in production along with LangChain for a work project.
Your contribution
Yes, I am willing to contribute. I haven't contributed to LangChain directly before, but I am familiar with the source code from investigating it. I would love to collaborate on what kind of framework/interface we would need to expand graph indexes to a scope similar to that of vector database indexes.