carverauto / threadr

🌎 OSS Real-time AI Data Analysis with GraphDB integration. 🔍
Apache License 2.0

neo4j vector similarity search: contextualize the search to specific nodes (e.g., messages from a particular user) #50

Open mfreeman451 opened 7 months ago

mfreeman451 commented 7 months ago

When we perform cosine similarity searches against our vector search index, we can ask questions about the content that might look like this:

Find messages about oranges

But we are unable to narrow our search when we ask questions like:

Find messages about oranges from Alice

These questions can be answered in other ways, for example by letting the LLM see all the content returned by a carefully crafted Cypher query, but I think being unable to anchor similarity searches to a specific vertex in our graph makes the vector search less useful.

We need a custom retrieval strategy that combines graph traversal with vector search.

Pseudo-code:

// Step 1: Identify messages sent by Alice
MATCH (u:User {name: 'Alice'})-[:SENT]->(m:Message)
WITH collect(m.embedding) AS aliceEmbeddings

// Steps 2 & 3: Assuming an external function vectorSearch that performs the vector search
// This part is pseudocode, as the actual implementation would depend on your application logic

LET queryVector = encodeQuery("favorite food") // Encode your query into a vector
LET searchResults = vectorSearch(queryVector, aliceEmbeddings) // Perform the search

// Step 4: Process and return the search results
// This would involve mapping the search results back to messages and returning relevant information

RETURN searchResults
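
The steps above can be sketched in application code as follows. This is a minimal in-memory illustration, not the project's implementation: the `Message` class, `search_user_messages`, and the three-dimensional toy embeddings are all hypothetical, and the sender filter stands in for the `MATCH (u:User)-[:SENT]->(m:Message)` traversal from Step 1. In a real setup the candidates would come from Neo4j and the query vector from whatever embedding model the index was built with.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    # Hypothetical stand-in for a (:Message) node and its sender.
    sender: str
    text: str
    embedding: List[float]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_user_messages(
    messages: List[Message], sender: str, query_vector: List[float], k: int = 3
) -> List[Tuple[float, Message]]:
    # Step 1: anchor the search. Here this is a plain filter on the sender;
    # in the graph it would be the traversal from the User node.
    candidates = [m for m in messages if m.sender == sender]

    # Steps 2 & 3: score only the anchored candidates against the query vector.
    scored = [(cosine_similarity(query_vector, m.embedding), m) for m in candidates]

    # Step 4: return the top-k (score, message) pairs, best match first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: made-up 3-dimensional embeddings, for illustration only.
messages = [
    Message("Alice", "I love oranges", [0.9, 0.1, 0.0]),
    Message("Alice", "The build is failing again", [0.0, 0.2, 0.9]),
    Message("Bob", "Oranges are the best fruit", [1.0, 0.0, 0.0]),
]

# Assumed to be the encoded form of the query "oranges".
query_vector = [1.0, 0.0, 0.0]

for score, message in search_user_messages(messages, "Alice", query_vector, k=2):
    print(f"{score:.3f} {message.sender}: {message.text}")
```

Because the filter runs before the ranking, Bob's message is never scored, even though it is the closest match to the query overall. Filtering after a global top-k search would not give the same guarantee, since Alice's messages could be crowded out of the top k entirely.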

Helpful links:

https://neo4j.com/developer-blog/neo4j-langchain-vector-index-implementation/
https://towardsdatascience.com/efficient-semantic-search-over-unstructured-text-in-neo4j-8179ad7ff451