docker / genai-stack

Langchain + Docker + Neo4j + Ollama

How to use the neo4j database and vector index with imported turtle files for GenAI? #85

Open jli113 opened 10 months ago

jli113 commented 10 months ago

When I imported a Turtle model into the Neo4j database and started asking questions about the file, I did not get the answers I wanted; even asking it directly to describe a URI failed. In fact, the answers are worse than feeding the text of the Turtle file directly to the LLM. The model was imported through n10s. Cypher queries on Neo4j work fine, and Cypher queries also work at http://localhost:8505, but anything other than a plain query does not return logical answers from the bot. Before the GenAI stack, I also tried Ollama with Langchain to read the file as plain text; that worked fine except that the model could not understand the relationships in the semantic web, which is why I turned to the GenAI stack. The idea of RAG is exactly how I want to guide the LLM toward domain knowledge deduction. Maybe I am going about it the wrong way. The Turtle file is from Brick Schema.

jexp commented 10 months ago

Can you share how you imported the turtle file? (code)

And what your graph model in neo4j looks like (screenshot)?

You'll also have to create a vector index and index certain text properties of your graph to make the Vector + Graph work.
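
For illustration, here is a minimal sketch of that step using LangChain's Neo4jVector.from_existing_graph, which creates the vector index if it is missing and writes embeddings for the chosen text properties back onto the nodes. The label, property names, index name, credentials and embedding model below are placeholder assumptions for an n10s-imported graph, not code from this repo:

# Hypothetical sketch: embed text properties of n10s-imported Resource nodes
# and create a vector index over them. All names below are illustrative.
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import Neo4jVector

store = Neo4jVector.from_existing_graph(
    embedding=OllamaEmbeddings(model="llama2"),     # match the embedding model you actually run
    url="bolt://localhost:7687",
    username="neo4j",
    password="password",
    index_name="resource_index",                    # vector index, created if it does not exist
    node_label="Resource",                          # n10s imports RDF resources under this label
    text_node_properties=["uri", "rdfs__label"],    # adjust to the text properties your graph really has
    embedding_node_property="embedding",            # embeddings are stored in this node property
)

retriever = store.as_retriever()                    # this can then back the RAG chain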

jli113 commented 10 months ago

> Can you share how you imported the turtle file? (code)
>
> And what your graph model in neo4j looks like (screenshot)?
>
> You'll also have to create a vector index and index certain text properties of your graph to make the Vector + Graph work.

Sure, here's the code:

# first copy file
docker cp dtlab.ttl ContainerName:/home 

// then in neo4j run
CREATE CONSTRAINT n10s_unique_uri ON (r:Resource) ASSERT r.uri IS UNIQUE;
CALL n10s.rdf.import.fetch("file:///home/dtlab.ttl", "Turtle")

jexp commented 10 months ago

There's more missing: the n10s import, and the vector index for RAG.

jbarrasa commented 10 months ago

It would be useful to get the graph config settings too. Did you go with the defaults, i.e. just call n10s.graphconfig.init()?

Also, could you share the dtlab.ttl file or a fragment of it?

jli113 commented 10 months ago

I am using the file here: https://brickschema.org/ttl/gtc_brick.ttl. The graph config is the default.

# first copy file
docker cp gtc_brick.ttl genai-stack-database-1:/home

// then in neo4j
CALL n10s.graphconfig.init();
CALL n10s.rdf.import.fetch("file:///home/gtc_brick.ttl", "Turtle", {verifyUriSyntax: false})

Nothing more. I did not find details about the vector index for RAG in the readme, so I only pressed the import button at http://localhost:8502/.

jli113 commented 9 months ago

> You'll also have to create a vector index and index certain text properties of your graph to make the Vector + Graph work.

How do I go through these steps?

JasonPad19 commented 9 months ago

This is exactly what I am trying to test this weekend.

Instead of using n10s, I find it easy to use the RDFLib-neo4j package to push TTL data to Neo4j.
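
As a rough sketch of that approach, following the package's documented usage (connection details, URI-handling strategy, and file name below are placeholders; check the rdflib-neo4j README for the exact current API):

# Hypothetical sketch: stream a Turtle file into Neo4j with rdflib-neo4j.
# Connection details and the file name are placeholders.
from rdflib import Graph
from rdflib_neo4j import HANDLE_VOCAB_URI_STRATEGY, Neo4jStore, Neo4jStoreConfig

auth = {"uri": "bolt://localhost:7687", "database": "neo4j",
        "user": "neo4j", "pwd": "password"}

config = Neo4jStoreConfig(
    auth_data=auth,
    handle_vocab_uri_strategy=HANDLE_VOCAB_URI_STRATEGY.IGNORE,  # drop namespaces, keep local names
    batching=True,
)

g = Graph(store=Neo4jStore(config=config))
g.parse("gtc_brick.ttl", format="ttl")  # triples become nodes and relationships in Neo4j
g.close(True)                           # commit any remaining batched triples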

I am with you on the same question: how to create a vector index for those imported nodes and relationships.

Let me know if you have figured it out. :)